2006-10-16 03:04:29

by Gareth Knight

[permalink] [raw]
Subject: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16

I looked in MAINTAINERS for a suitable person for the generic signal
code, but couldn't find anyone in particular. Please Cc me on
comments, which are most welcome, as I am not on LKML, although I do
peruse the archives.

Gareth.


Userspace signal mask
==================

Inspiration and rationale:

This is a proposed addition to the linux kernel to reduce the
overhead required to mask signals. The intended usage is an
application with critical sections that need to be guarded against
deadlock by preventing signals from being delivered whilst inside one
of the critical sections. Currently such applications may be very
heavy users of the sigprocmask syscall, this proposal provides an
additional signal mask stored in userspace that can be updated with a
simple store rather than a syscall.

The userspace sigprocmask is activated via a new subcommand of the
sigprocmask or rt_sigprocmask syscall. The first argument is
currently an integer indicating which operation to perform.
Currently the operations are SIG_BLOCK, SIG_UNBLOCK and SIG_SETMASK;
the new operation I've called SIG_SETUSERMASKADDRESS.

Usage is:

sigprocmask( SIG_SETUSERMASKADDRESS, sigset_t *address, NULL);

The *address pointer points to wherever you've decided to keep the
userspace sigmask in your thread. Any non-zero address enables the
use of the userspace sigprocmask and conversely, setting the address
to zero disables it. On a kernel without support, this operation can
be expected to return -EINVAL.

Behaviour:

Each thread has its own address, but there is no restriction on
threads in a process pointing to the same address.

By default, a process does not have a userspace sigmask. When a new
thread is created or a process is forked, the new thread inherits any
userspace sigmask address from its parent.

At any time the userspace sigmask is enabled, the effective sigmask
is the union of the userspace sigmask and the traditional sigmask in
the kernel for that thread. The traditional kernel sigmask is set
and controlled by the SIG_BLOCK etc operations as it was before.

As soon as the relevant bit is set in the userspace sigmask, that
signal is blocked in the same way it would be as if you'd blocked it
via a traditional sigprocmask call. If a signal is both blocked and
also pending for delivery, clearing the bit in the userspace sigmask
does not guarantee immediate delivery as a SIG_UNBLOCK call would -
instead delivery typically occurs at the next pre-emption point (any
syscall for example).

If a bad address is passed to the kernel for the location of the
userspace mask, it is discarded.

Patch:

This patch is against a vanilla 2.6.16 kernel from kernel.org. The
modifications are entirely architecture independent. It has been
tested on i386, x86_64, ia64 and ppc64 kernels:

--- pristine-linux-2.6.16/include/linux/sched.h 2006-03-19
21:53:29.000000000 -0800
+++ linux-2.6.16-userspacesigmask/include/linux/sched.h 2006-10-09
18:09:24.000000000 -0700
@@ -807,6 +807,7 @@
struct sighand_struct *sighand;
sigset_t blocked, real_blocked;
+ sigset_t __user *userspace_blocked; /* If non-zero, task has opted
to maintain an additional blocked mask in userspace */
sigset_t saved_sigmask; /* To be restored with TIF_RESTORE_SIGMASK */
struct sigpending pending;
--- pristine-linux-2.6.16/kernel/fork.c 2006-03-19 21:53:29.000000000
-0800
+++ linux-2.6.16-userspacesigmask/kernel/fork.c 2006-10-05
22:10:13.000000000 -0700
@@ -987,6 +987,7 @@
spin_lock_init(&p->alloc_lock);
spin_lock_init(&p->proc_lock);
+ p->userspace_blocked = current->userspace_blocked;
clear_tsk_thread_flag(p, TIF_SIGPENDING);
init_sigpending(&p->pending);
--- pristine-linux-2.6.16/kernel/signal.c 2006-03-19
21:53:29.000000000 -0800
+++ linux-2.6.16-userspacesigmask/kernel/signal.c 2006-10-14
21:08:38.000000000 -0700
@@ -24,6 +24,7 @@
#include <linux/ptrace.h>
#include <linux/posix-timers.h>
#include <linux/signal.h>
+#include <linux/mm.h>
#include <linux/audit.h>
#include <linux/capability.h>
#include <asm/param.h>
@@ -224,7 +225,30 @@
void recalc_sigpending(void)
{
- recalc_sigpending_tsk(current);
+ sigset_t usermask, combinedblocked;
+
+ sigemptyset(&usermask);
+
+ if (current->userspace_blocked != 0)
+ {
+ /* Task is using the optional userspace signal mask, fetch it
from the address given to us. */
+ /* If the address is bad, we don't mind and the mask remains
empty */
+ if (copy_from_user(&usermask, current->userspace_blocked, sizeof
(sigset_t)))
+ {
+ /* If we have a bad address, disable the userspace mask */
+ current->userspace_blocked = 0;
+ }
+ }
+
+ sigorsets(&combinedblocked,&usermask,&current->blocked);
+
+ if (current->signal->group_stop_count > 0 ||
+ (freezing(current)) ||
+ PENDING(&current->pending, &combinedblocked) ||
+ PENDING(&current->signal->shared_pending, &combinedblocked))
+ set_thread_flag(TIF_SIGPENDING);
+ else
+ clear_thread_flag(TIF_SIGPENDING);
}
/* Given the mask, find the first available signal that should be
serviced. */
@@ -914,6 +938,7 @@
if (sigismember(&t->blocked, sig)) {
sigdelset(&t->blocked, sig);
}
+
recalc_sigpending_tsk(t);
ret = specific_send_sig_info(sig, info, t);
spin_unlock_irqrestore(&t->sighand->siglock, flags);
@@ -1921,7 +1946,7 @@
{
sigset_t *mask = &current->blocked;
int signr = 0;
-
+
relock:
spin_lock_irq(&current->sighand->siglock);
for (;;) {
@@ -2136,6 +2161,15 @@
if (set) {
error = -EFAULT;
+
+ /* SIG_SETUSERMASK just needs to record the address of the extra
mask in userspace and return */
+ if ( how == SIG_SETUSERMASKADDRESS )
+ {
+ current->userspace_blocked = set;
+ error = 0;
+ goto out;
+ }
+
if (copy_from_user(&new_set, set, sizeof(*set)))
goto out;
sigdelsetmask(&new_set, sigmask(SIGKILL)|sigmask(SIGSTOP));
@@ -2607,6 +2641,9 @@
break;
case SIG_SETMASK:
current->blocked.sig[0] = new_set;
+ break;
+ case SIG_SETUSERMASKADDRESS:
+ current->userspace_blocked = (sigset_t *) set;
break;
}
--- pristine-linux-2.6.16/include/asm-generic/signal.h 2006-03-19
21:53:29.000000000 -0800
+++ linux-2.6.16-userspacesigmask/include/asm-generic/signal.h
2006-10-09 18:13:36.000000000 -0700
@@ -7,6 +7,9 @@
#ifndef SIG_SETMASK
#define SIG_SETMASK 2 /* for setting the signal mask */
#endif
+#ifndef SIG_SETUSERMASKADDRESS
+#define SIG_SETUSERMASKADDRESS 8 /* for setting the optional
userspace signal mask address */
+#endif
#ifndef __ASSEMBLY__
typedef void __signalfn_t(int);


2006-10-16 03:37:49

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16



On Sun, 15 Oct 2006, Gareth Knight wrote:
>
> I looked in MAINTAINERS for a suitable person for the generic signal code, but
> couldn't find anyone in particular. Please Cc me on comments, which are most
> welcome, as I am not on LKML, although I do peruse the archives.

That's a truly horribly disfigured patch - your whitespace is all screwed
up.

Anyway, the whole approach is not doable. At all.

Why? You're doing user-space accesses from within critical sections with a
spinlock, and that's just a big no-no. Think page faults, swapping etc.

That's ignoring all the issues with the fact that doing the user accesses
during recalc_sigpending is broken for other reasons, namely that we don't
even _do_ the signal pending recalculation all the time, just when we
"know" things may have changed. So your approach would miss updates to the
user-space masks.

So the whole approach is flawed.

You _could_ try to make it do something special at signal delivery time,
to see if delivery can be delayed at that point, but quite frankly, it's
going to be nasty there too (and that's going to be a disaster for the
whole issue of non-thread-specific signals, which have been steered to one
thread, and then the new mask would say that they can't be accepted by
that thread after all).

Quite frankly, you'd probably be better off trying to do totally different
approaches. For example, it would be possible to block all signals
entirely, and then just create a new system call that uses a _synchronous_
delivery method to avoid races with async delivery. Preferably a file
descriptor, so that you can select/poll on it.

That was discussed at some point. See for example:

http://groups.google.com/group/linux.kernel/browse_thread/thread/1332715ae3e26b9/1f3fc521db812a07?lnk=st&q=&rnum=1&hl=en#1f3fc521db812a07

which I found by just googling for "synchronous signal queue" with me as
the author. That's from almost four years ago, and nobody ever got quite
excited enough about it to actually take it any further, but I still think
it's a lot better than the alternatives like yours..

Linus

2006-10-16 03:50:48

by Gareth Knight

[permalink] [raw]
Subject: Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16

Hey Linus,

Thanks for the lightning fast response !

I take your points - the reason I favoured trying userspace access
approach was to keep the feature portable. I've seen this feature in
other *nix kernels but always done with the mask kept on some sort of
syspage shared between user and kernel space, at a fixed address but
contents local to each thread. I would love to add such a feature to
Linux - perhaps keep the tid, timeofday and other popular things on
there as well; however I felt that sort of work was beyond my kernel
hacking abilities right now. I guess it could either be done as an
outright special case or part of supporting a MAP_LOCAL style of mmap
on Linux.

I did google for MAP_LOCAL and syspage, but I didn't see any
promising avenues in terms of previous work I could pickup.

Gareth.

>
>
> On Sun, 15 Oct 2006, Gareth Knight wrote:
>>
>> I looked in MAINTAINERS for a suitable person for the generic
>> signal code, but
>> couldn't find anyone in particular. Please Cc me on comments,
>> which are most
>> welcome, as I am not on LKML, although I do peruse the archives.
>
> That's a truly horribly disfigured patch - your whitespace is all
> screwed
> up.
>
> Anyway, the whole approach is not doable. At all.
>
> Why? You're doing user-space accesses from within critical sections
> with a
> spinlock, and that's just a big no-no. Think page faults, swapping
> etc.
>
> That's ignoring all the issues with the fact that doing the user
> accesses
> during recalc_sigpending is broken for other reasons, namely that
> we don't
> even _do_ the signal pending recalculation all the time, just when we
> "know" things may have changed. So your approach would miss updates
> to the
> user-space masks.
>
> So the whole approach is flawed.
>
> You _could_ try to make it do something special at signal delivery
> time,
> to see if delivery can be delayed at that point, but quite frankly,
> it's
> going to be nasty there too (and that's going to be a disaster for the
> whole issue of non-thread-specific signals, which have been steered
> to one
> thread, and then the new mask would say that they can't be accepted by
> that thread after all).
>
> Quite frankly, you'd probably be better off trying to do totally
> different
> approaches. For example, it would be possible to block all signals
> entirely, and then just create a new system call that uses a
> _synchronous_
> delivery method to avoid races with async delivery. Preferably a file
> descriptor, so that you can select/poll on it.
>
> That was discussed at some point. See for example:
>
> http://groups.google.com/group/linux.kernel/browse_thread/thread/
> 1332715ae3e26b9/1f3fc521db812a07?
> lnk=st&q=&rnum=1&hl=en#1f3fc521db812a07
>
> which I found by just googling for "synchronous signal queue" with
> me as
> the author. That's from almost four years ago, and nobody ever got
> quite
> excited enough about it to actually take it any further, but I
> still think
> it's a lot better than the alternatives like yours..
>
> Linus

2006-10-16 13:28:52

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16

Linus Torvalds <[email protected]> writes:
>
> Why? You're doing user-space accesses from within critical sections with a
> spinlock, and that's just a big no-no. Think page faults, swapping etc.

He could pin the page in memory like futexes do. One page pinned
per thread shouldn't be a big DOS issue either.

-Andi

2006-10-16 23:15:48

by Nicholas Miell

[permalink] [raw]
Subject: Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16

On Mon, 2006-10-16 at 15:28 +0200, Andi Kleen wrote:
> Linus Torvalds <[email protected]> writes:
> >
> > Why? You're doing user-space accesses from within critical sections with a
> > spinlock, and that's just a big no-no. Think page faults, swapping etc.
>
> He could pin the page in memory like futexes do. One page pinned
> per thread shouldn't be a big DOS issue either.
>
> -Andi

Whatever the final solution ends up being (assuming this actually goes
somewhere), it should be extensible enough for other uses (for example,
some sort of "please don't preempt me right now" hint for use with
pthread spinlocks.)


--
Nicholas Miell <[email protected]>

2006-10-17 09:24:13

by George Spelvin

[permalink] [raw]
Subject: Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16

> This is a proposed addition to the linux kernel to reduce the
> overhead required to mask signals. The intended usage is an
> application with critical sections that need to be guarded against
> deadlock by preventing signals from being delivered whilst inside one
> of the critical sections. Currently such applications may be very
> heavy users of the sigprocmask syscall, this proposal provides an
> additional signal mask stored in userspace that can be updated with a
> simple store rather than a syscall.

Wouldn't a simpler solution be to provide a way to cancel signal
delivery after it's hit user-space?

Then user space can keep its own block mask, which is maintained as
a superset of the kernel block mask. Then a critical section would,
in the fast path, proceed like:

- Block signals -> Noted in user-space only
- Do critical section (part 1)
- Do critical section (part 2)
- Unblock signals -> Noted in user-space only
- More code

And if something bad happened

- Block signals -> Noted in user-space only
- Do critical section (part 1)
- Signal arrives
- Test against user-space mask
- Tell kernel about all blocked signals -> Note new kernel mask
- Return "please try again later" from signal handler
- Do critical section (part 2)
- Unblock signals -> Note that it's less than kernel mask
- Tell kernel about newly unblocked signals
- Signal arrives (again)
- Test against user-space mask
- Call registered signal handler
- Return "signal handled"
- More code

This does do the whole signal delivery dance twice if it gets unlucky,
but keeps a conceptually simpler kernel interface.

The one thing that might be complicated is pthread signal delivery.
"Please try again later" could need to go back to the process layer and
immediately re-deliver it to a different thread.

2006-10-17 12:32:13

by Gareth Knight

[permalink] [raw]
Subject: Re: [PATCH] generic signal code (small new feature - userspace signal mask), kernel 2.6.16

A way to cancel delivery / push things back onto the signal queue
would indeed be a worthy addition. The way I would see that working
would be where you passed a special or new parameter to sigreturn()
to indicate you wanted that signal requeued, presumably the
implication would be that is was also added to the blocked mask at
the same time.

The scheme for critical sections described below is a decent
compromise, but it does require more application change than just
speeding up sigprocmask. My idea was that glibc could take advantage
of the sigprocmask speedup when it was available and thus you
automatically roll out the benefit to existing applications.

Linus pretty much mauled the suggested patch, but I don't think its
quite as bad as it may first appear. The suggestion from Andi to
pin / lock the page (inspired from the futex code) I think solves a
number of edge cases right off the bat and is straightforward to do.
The concern over a copy_from_user whilst holding the &current-
>sighand->siglock I think is just a case of ensuring
recalc_sigpending is not called in that situation - it doesn't seem
necessary to hold the siglock to recalc the pending signals.

I also re-read the signal code and whilst it is quite true that
recalc_sigpending is not called in every path, the omissions seem to
me to be places that would be irrelevant anyway - in the case of
forcing signal delivery for instance, where you wish to ignore any mask.

I do agree that in the case of unblocking a pending signal with my
scheme, delivery is not instantaneous (typically the next syscall),
but I would argue that this is only the case for signals which would
not have been temporally deterministic anyway - some synchronous
event such as a SIGBUS will cause pre-emption in a deterministic
manner as always.

Suggestions and comments most welcome,

Gareth.


>> This is a proposed addition to the linux kernel to reduce the
>> overhead required to mask signals. The intended usage is an
>> application with critical sections that need to be guarded against
>> deadlock by preventing signals from being delivered whilst inside one
>> of the critical sections. Currently such applications may be very
>> heavy users of the sigprocmask syscall, this proposal provides an
>> additional signal mask stored in userspace that can be updated with a
>> simple store rather than a syscall.
>
> Wouldn't a simpler solution be to provide a way to cancel signal
> delivery after it's hit user-space?
>
> Then user space can keep its own block mask, which is maintained as
> a superset of the kernel block mask. Then a critical section would,
> in the fast path, proceed like:
>
> - Block signals -> Noted in user-space only
> - Do critical section (part 1)
> - Do critical section (part 2)
> - Unblock signals -> Noted in user-space only
> - More code
>
> And if something bad happened
>
> - Block signals -> Noted in user-space only
> - Do critical section (part 1)
> - Signal arrives
> - Test against user-space mask
> - Tell kernel about all blocked signals -> Note new kernel mask
> - Return "please try again later" from signal handler
> - Do critical section (part 2)
> - Unblock signals -> Note that it's less than kernel mask
> - Tell kernel about newly unblocked signals
> - Signal arrives (again)
> - Test against user-space mask
> - Call registered signal handler
> - Return "signal handled"
> - More code
>
> This does do the whole signal delivery dance twice if it gets unlucky,
> but keeps a conceptually simpler kernel interface.
>
> The one thing that might be complicated is pthread signal delivery.
> "Please try again later" could need to go back to the process layer
> and
> immediately re-deliver it to a different thread.