2019-08-14 16:56:47

by Rich Felker

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] waitid: Add support for waiting for the current process group

On Wed, Aug 14, 2019 at 06:34:44PM +0200, Christian Brauner wrote:
> On Wed, Aug 14, 2019 at 06:15:17PM +0200, Christian Brauner wrote:
> > On Wed, Aug 14, 2019 at 06:09:17PM +0200, Oleg Nesterov wrote:
> > > On 08/14, Christian Brauner wrote:
> > > >
> > > > and a signal could come in between the system call that
> > > > retrieved the process gorup and the call to waitid that changes the
> > > ^^^^^
> > > > current process group.
> > >
> > > I noticed this typo only because I spent 2 minutes or more trying to
> > > understand this sentence ;) But yes, a signal handler or another thread
> >
> > I'll try to rewrite it. :)
>
> Ok, here's what I changed it to:
>
> It was recently discovered that the linux version of waitid is not a
> superset of the other wait functions because it does not include
> support for waiting for the current process group. This has two
> downsides:
> 1. An extra system call is needed to get the current process group.
> 2. After the current process group is received and before it is passed
> to waitid a signal could arrive causing the current process group to change.

I don't think "downsides" sufficiently conveys that this is hard
breakage of a requirement for waitpid. How about something like the
following?

"It was recently discovered that the linux version of waitid is not a
superset of the other wait functions because it does not include
support for waiting for the current process group. Userspace cannot
simply emulate this functionality with an additional getpgid syscall
due to inherent race conditions that violate the async-signal safety
requirements for waitpid."

Rich


2019-08-14 17:08:39

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] waitid: Add support for waiting for the current process group

On Wed, Aug 14, 2019 at 9:55 AM Rich Felker <[email protected]> wrote:
>
> I don't think "downsides" sufficiently conveys that this is hard
> breakage of a requirement for waitpid.

Well, let's be honest here. Who has _ever_ seen a signal handler
changing the current process group?

In fact, the SYSV version of setpgid() takes a process ID to set it
*for somebody else*, so the signal safety is not even necessarily
relevant, since it might be racing with _another_ thread doing it
(which even the kernel side won't fix - it's just user space doing odd
things).

So yes - it's technically true that it's impossible to emulate
properly in user space.

But I doubt it makes _any_ difference what-so-ever, and glibc might as
well do something like

ret = waitid(P_PGID, 0, ..);
if (ret == -EINVAL) { do the emulation }

which makes it work with older kernels, and has zero downside in practice.

Hmm?

Linus

2019-08-14 18:01:44

by Rich Felker

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] waitid: Add support for waiting for the current process group

On Wed, Aug 14, 2019 at 10:06:19AM -0700, Linus Torvalds wrote:
> On Wed, Aug 14, 2019 at 9:55 AM Rich Felker <[email protected]> wrote:
> >
> > I don't think "downsides" sufficiently conveys that this is hard
> > breakage of a requirement for waitpid.
>
> Well, let's be honest here. Who has _ever_ seen a signal handler
> changing the current process group?
>
> In fact, the SYSV version of setpgid() takes a process ID to set it
> *for somebody else*, so the signal safety is not even necessarily
> relevant, since it might be racing with _another_ thread doing it
> (which even the kernel side won't fix - it's just user space doing odd
> things).

For that case, the operations are inherently unordered with respect to
each other, and assuming the interpretation that waitpid is allowed to
wait on "the pgid at the time of the call" rather than at the time of
child exit/status-change -- which was discussed thoroughly in the
thread leading up to this patch -- there is no conformance
distinction.

On the other hand, with changing your own pgid from a signal handler,
there is a clear observable ordering between the events. For example,
if the signal handler changes the pgid and forks a child with the new
pgid, waitpid for "own pgid" can be assumed to include the new child
in its wait set.

I agree this is not common usage, so impact of breakage is probably
low, but I'd rather not have wrong/racy hacks be something we're
committed to supporting indefinitely on the userspace side.

> So yes - it's technically true that it's impossible to emulate
> properly in user space.
>
> But I doubt it makes _any_ difference what-so-ever, and glibc might as
> well do something like
>
> ret = waitid(P_PGID, 0, ..);
> if (ret == -EINVAL) { do the emulation }
>
> which makes it work with older kernels, and has zero downside in practice.
>
> Hmm?

It only affects RV32 anyway; other archs all have a waitpid syscall
that can be used. Since there's not yet any official libc release with
RV32 support and AIUI the ABI is not considered "frozen" yet,
emulation doesn't seem useful here. Whatever kernel version fixes this
(or some later one, if nobody gets things together on upstreaming libc
support of RV32) will just become the minimum version for RV32.

Rich