LinuxLists.cc - [RFC / musing] Scoped exception handling in Linux userspace?

2013-07-19 00:26:35

Subject: [RFC / musing] Scoped exception handling in Linux userspace?

Windows has a feature that I've wanted on Linux forever: stack-based
(i.e. scoped) exception handling. The upshot is that you can do,
roughly, this (pseudocode):

int callback(...)
{
        /* Called if code_that_may_fault faults. May return "unwind to
           landing pad", "propagate the fault", or "fixup and retry" */
}

void my_function()
{
        __hideous_try_thing(callback) {
                code_that_may_fault();
        } blahblahblah {
                landing_pad_code();
        }
}

Windows calls it SEH (structured exception handling), and the
implementation on 32-bit Windows is rather gnarly. I don't really
know how it works on 64-bit Windows, but I think it's saner.

This has two really nice properties:

1. It works in libraries!

2. It's localized. So you can mmap something, read from it *and
handle SIGBUS*, and unmap.
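
For comparison, here is a minimal sketch of what "read from a mapping and
handle SIGBUS" looks like today, assuming the usual sigsetjmp/siglongjmp
workaround. All names (landing_pad, guarded_read, sigbus_demo) are
hypothetical. Note that the sigaction call changes process-wide state, which
is exactly the non-local, library-hostile part.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names throughout; none of this is an existing API. */
static sigjmp_buf landing_pad;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(landing_pad, 1);          /* unwind to the landing pad */
}

/* Read one byte at p. Returns 0 on success, -1 if the read raised SIGBUS. */
static int guarded_read(const volatile char *p, char *out)
{
    struct sigaction sa, old;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)   /* process-wide, not scoped */
        return -1;

    if (sigsetjmp(landing_pad, 1) == 0) {    /* 1: also restore signal mask */
        *out = *p;                           /* code_that_may_fault() */
        rc = 0;
    } else {
        rc = -1;                             /* landing_pad_code() */
    }

    sigaction(SIGBUS, &old, NULL);           /* put the old handler back */
    return rc;
}

/* Map two pages of a one-page file; the second page lies past EOF. */
static int sigbus_demo(void)
{
    char path[] = "/tmp/sigbus-demo-XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    char c = 0;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, page) != 0) {
        close(fd);
        return -1;
    }
    char *map = mmap(NULL, 2 * page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    int first = guarded_read(map, &c);          /* inside the file */
    int second = guarded_read(map + page, &c);  /* past EOF: SIGBUS */
    munmap(map, 2 * page);
    return first == 0 && second == -1;
}
```

sigbus_demo() returns 1 when the in-file read succeeds and the past-EOF read
is caught. The single global jump buffer and the save/restore of the handler
are not thread-safe, and any other code that also wants SIGBUS conflicts with
this.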

Could Linux support such a thing? Here's a sketch of a way:

- The kernel would need to have a fairly well-defined concept of
synchronous faults that can be handled with this mechanism. Calls to
force_sig_info are probably the right thing to hook in to.

- The userspace runtime optionally registers (via a new syscall or
prctl, say) a handler for synchronous faults.

- When a synchronous fault happens, if the process (struct
sighand_struct) has a synchronous fault handler registered, the signal
is delivered to that handler, on the thread that faulted, instead of
via the normal signal handling mechanism.

- The userspace runtime walks the chain of personality handlers and
gives them a chance to respond.

- If no handler claims the fault, then the user code somehow* causes
ordinary signal delivery to happen.

* This may need kernel help, too -- if the process is going to die, it
should die for the right reason, so perhaps there should be a syscall
to redeliver the signal. If the runtime wants to be fancy and a
signal handler is installed, then there could be a fast path. Maybe
if we got really fancy, it could live in the vdso.
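
Without a redelivery syscall, one existing approximation for synchronous
faults is to restore the default disposition and return, so that the faulting
instruction runs again and the kernel takes its normal fatal path. A minimal
sketch follows, with hypothetical names, run in a forked child so the parent
can inspect the wait status. It is an illustration of the workaround, not of
the proposed interface.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs when no scoped handler claimed the fault (hypothetical policy). */
static void unclaimed_fault(int sig)
{
    /* Restore the default action and return. The faulting instruction is
       re-executed, faults again, and this time the kernel kills us. */
    signal(sig, SIG_DFL);
}

/* Returns the signal that terminated the child, or -1 on error. */
static int termination_signal_after_redelivery(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        struct sigaction sa;

        setrlimit(RLIMIT_CORE, &no_core);    /* keep the demo from dumping core */
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = unclaimed_fault;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        /* An inaccessible page gives a reliable synchronous SIGSEGV. */
        volatile char *p = mmap(NULL, 4096, PROT_NONE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            _exit(2);
        *p = 1;      /* fault, handler, fault again, default action */
        _exit(1);    /* not reached */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

The parent sees the child die from SIGSEGV rather than from an exit code.
This trick only applies when returning from the handler really does re-run
the faulting instruction, and resetting the disposition affects every thread
in the process.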

Now everyone wins! After someone writes the libgcc support for this
(ugh!), then you can write CFI-based exception handlers in assembly!
Presumably you could write them in C++, too, if you don't care about
restarting, like this:

try {
        code_that_may_fault();
} catch (cxxabi::synchronous_kernel_fault &) {
        amazingly_dont_crash();
}

Is this worth pursuing? I'm not touching the gcc part with a ten-foot
pole, but I could probably do some of the kernel work. I'm a bit
scared of libgcc, too.

It's worth noting that SIGBUS isn't the only interesting signal here.
SIGFPE could work, too. I'm not sure whether SIGPIPE would make
sense. SIGSEGV would clearly work, but anyone using this mechanism
for SIGSEGV is probably asking for trouble.

--Andy

P.S. Just because you can probably get away with throwing a C++
exception from a signal handler right now does not mean it's a good
idea. Especially in a library.

2013-07-19 00:40:20

by David Daney

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On 07/18/2013 05:26 PM, Andy Lutomirski wrote:
> Windows has a feature that I've wanted on Linux forever: stack-based
> (i.e. scoped) exception handling. The upshot is that you can do,
> roughly, this (pseudocode):
>
> int callback(...)
> {
> /* Called if code_that_may_fault faults. May return "unwind to
> landing pad", "propagate the fault", or "fixup and retry" */
> }
>
> void my_function()
> {
> __hideous_try_thing(callback) {
> code_that_may_fault();
> } blahblahblah {
> landing_pad_code();
> }
> }

How is this different than throwing exceptions from a signal handler?

GCC already supports this on many architectures running on the Linux kernel.

You can do it from C using incantations like those found in the GCC
testsuite's gcc/testsuite/gcc.dg/cleanup-9.c file.

From C++ it is even easier: it is just a normal exception.

David Daney


2013-07-19 00:50:41

by Andy Lutomirski

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On Thu, Jul 18, 2013 at 5:40 PM, David Daney wrote:
> On 07/18/2013 05:26 PM, Andy Lutomirski wrote:
>>
>> Windows has a feature that I've wanted on Linux forever: stack-based
>> (i.e. scoped) exception handling. The upshot is that you can do,
>> roughly, this (pseudocode):
>>
>> int callback(...)
>> {
>> /* Called if code_that_may_fault faults. May return "unwind to
>> landing pad", "propagate the fault", or "fixup and retry" */
>> }
>>
>> void my_function()
>> {
>> __hideous_try_thing(callback) {
>> code_that_may_fault();
>> } blahblahblah {
>> landing_pad_code();
>> }
>> }
>
>
> How is this different than throwing exceptions from a signal handler?

Two ways. First, exceptions thrown from a signal handler can't be
retries. Second, and more importantly, installing a signal handler in
a library is a terrible idea.

--Andy

2013-07-19 01:17:53

by David Daney

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On 07/18/2013 05:50 PM, Andy Lutomirski wrote:
> On Thu, Jul 18, 2013 at 5:40 PM, David Daney wrote:
>> On 07/18/2013 05:26 PM, Andy Lutomirski wrote:
>>>
>>> Windows has a feature that I've wanted on Linux forever: stack-based
>>> (i.e. scoped) exception handling. The upshot is that you can do,
>>> roughly, this (pseudocode):
>>>
>>> int callback(...)
>>> {
>>> /* Called if code_that_may_fault faults. May return "unwind to
>>> landing pad", "propagate the fault", or "fixup and retry" */
>>> }
>>>
>>> void my_function()
>>> {
>>> __hideous_try_thing(callback) {
>>> code_that_may_fault();
>>> } blahblahblah {
>>> landing_pad_code();
>>> }
>>> }
>>
>>
>> How is this different than throwing exceptions from a signal handler?
>
> Two ways. First, exceptions thrown from a signal handler can't be
> retries.

??

> Second, and more importantly, installing a signal handler in
> a library is a terrible idea.

The signal handler would be installed by main() before calling into the
library. You have to have a small amount of boilerplate code to set it
up, but the libraries wouldn't have to be modified if they were already
exception safe.

FWIW the libgcj java runtime environment uses this strategy for handling
NullPointerExceptions and DivideByZeroError(sp?). Since all that code
for the most part follows the standard C++ ABIs, it is an example of
this technique that has been deployed in many environments.

David Daney

2013-07-19 03:30:16

by Andy Lutomirski

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On Thu, Jul 18, 2013 at 6:17 PM, David Daney wrote:
> On 07/18/2013 05:50 PM, Andy Lutomirski wrote:
>>
>> On Thu, Jul 18, 2013 at 5:40 PM, David Daney wrote:
>>>
>>> On 07/18/2013 05:26 PM, Andy Lutomirski wrote:
>>>
>>>
>>> How is this different than throwing exceptions from a signal handler?
>>
>>
>> Two ways. First, exceptions thrown from a signal handler can't be
>> retries.
>
>
> ??

s/retries/retried/. By that I mean that you can't do things like
implementing virtual memory in userspace by catching SIGSEGV, calling
mmap, and resuming.
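
The "fixup and retry" outcome can be sketched with a conventional handler
that repairs the mapping and simply returns, which is the step a thrown
exception cannot take. This is a minimal illustration with hypothetical names
(lazy_page, fixup_segv, lazy_page_demo). It uses mprotect rather than mmap
for brevity and assumes that calling mprotect from a handler is acceptable on
Linux, although POSIX does not list it as async-signal-safe.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical globals describing one lazily enabled page. */
static char *lazy_page;
static long page_size;
static volatile sig_atomic_t fixups;

static void fixup_segv(int sig, siginfo_t *si, void *ctx)
{
    char *addr = si->si_addr;

    (void)ctx;
    if (addr >= lazy_page && addr < lazy_page + page_size &&
        mprotect(lazy_page, page_size, PROT_READ | PROT_WRITE) == 0) {
        fixups++;
        return;              /* resume: the faulting store is retried */
    }
    signal(sig, SIG_DFL);    /* not our fault: let the retry be fatal */
}

/* Returns 1 if the store faulted once, was fixed up, and then succeeded. */
static int lazy_page_demo(void)
{
    struct sigaction sa, old;
    int ok;

    page_size = sysconf(_SC_PAGESIZE);
    lazy_page = mmap(NULL, page_size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (lazy_page == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = fixup_segv;
    sa.sa_flags = SA_SIGINFO;            /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(lazy_page, page_size);
        return -1;
    }

    *(volatile char *)lazy_page = 42;    /* faults once, then succeeds */
    ok = (fixups == 1 && *(volatile char *)lazy_page == 42);

    sigaction(SIGSEGV, &old, NULL);
    munmap(lazy_page, page_size);
    return ok;
}
```

The handler owns SIGSEGV for the whole process while it is installed, so this
shows the retry semantics only. It does nothing for the library-coordination
problem.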

>
>
>> Second, and more importantly, installing a signal handler in
>> a library is a terrible idea.
>
>
> The signal handler would be installed by main() before calling into the
> library. You have to have a small amount of boiler plate code to set it up,
> but the libraries wouldn't have to be modified if they were already
> exception safe.
>
> FWIW the libgcj java runtime environment uses this strategy for handling
> NullPointerExceptions and DivideByZeroError(sp?). Since all that code for
> the most part follows the standard C++ ABIs, it is an example of this
> technique that has been deployed in many environments.

Other way around: a *library* that wants to use exception handling
can't do so safely without the cooperation, or at least understanding,
of the main program and every other library that wants to do something
similar. Suppose my library installs a SIGFPE handler and throws
my_sigfpe_exception and your library installs a SIGFPE handler and
throws your_sigfpe_exception. The result: one wins and the other
crashes due to an unhandled exception.
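
The "one wins" part is easy to demonstrate with plain sigaction. In this
minimal sketch, lib_a_on_sigfpe and lib_b_on_sigfpe are hypothetical stand-ins
for the handlers two unrelated libraries would install during initialization.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>

/* Hypothetical handlers belonging to two unrelated libraries. */
static void lib_a_on_sigfpe(int sig) { (void)sig; }
static void lib_b_on_sigfpe(int sig) { (void)sig; }

/* What each library's init code would do; old receives the previous action. */
static void install(void (*handler)(int), struct sigaction *old)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGFPE, &sa, old);
}

/* Returns 'A' or 'B' for whichever library ends up owning SIGFPE. */
static char sigfpe_owner_after_both_init(void)
{
    struct sigaction original, replaced, current;
    char owner = '?';

    install(lib_a_on_sigfpe, &original);   /* library A initializes */
    install(lib_b_on_sigfpe, &replaced);   /* library B silently replaces A */

    sigaction(SIGFPE, NULL, &current);     /* query without changing */
    if (current.sa_handler == lib_b_on_sigfpe)
        owner = 'B';
    else if (current.sa_handler == lib_a_on_sigfpe)
        owner = 'A';

    sigaction(SIGFPE, &original, NULL);    /* restore the initial state */
    return owner;
}
```

Library B could chain to the action it received in replaced, but that only
works if every participant follows the same convention, which is the
coordination problem described above.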

In my particular use case, I have code (known to the main program) that
catches all kinds of fatal signals to log nice error messages before
dying. That means that I can't use a library that handles signals for
any other purpose. Right now I want to have a small snippet of code
handle SIGBUS, but now I need to coordinate it with everything else.

If this stuff were unified, then everything would just work.

--Andy

2013-07-19 05:50:16

by Tristan Gingold

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On Jul 19, 2013, at 2:26 AM, Andy Lutomirski wrote:

> Windows has a feature that I've wanted on Linux forever: stack-based
> (i.e. scoped) exception handling. The upshot is that you can do,
> roughly, this (pseudocode):

[...]

Indeed Windows and OpenVMS have such a mechanism. That's clean and
library friendly, but please read:
https://www.usenix.org/conference/wiess-2000/c-exception-handling-ia64
to understand how it hurts optimization.

(And no, raising an exception from a handler doesn't always work,
due to optimizations allowed by the gcc exception mechanism).

Regards,
Tristan.

2013-07-19 16:22:19

by David Daney

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On 07/18/2013 08:29 PM, Andy Lutomirski wrote:
> On Thu, Jul 18, 2013 at 6:17 PM, David Daney wrote:
>> On 07/18/2013 05:50 PM, Andy Lutomirski wrote:
>>>
>>> On Thu, Jul 18, 2013 at 5:40 PM, David Daney wrote:
>>>>
>>>> On 07/18/2013 05:26 PM, Andy Lutomirski wrote:
>>>>
>>>>
>>>> How is this different than throwing exceptions from a signal handler?
>>>
>>>
>>> Two ways. First, exceptions thrown from a signal handler can't be
>>> retries.
>>
>>
>> ??
>
> s/retries/retried, by which I mean that you can't do things like
> implementing virtual memory in userspace by catching SIGSEGV, calling
> mmap, and resuming.
>
>>
>>
>>> Second, and more importantly, installing a signal handler in
>>> a library is a terrible idea.
>>
>>
>> The signal handler would be installed by main() before calling into the
>> library. You have to have a small amount of boiler plate code to set it up,
>> but the libraries wouldn't have to be modified if they were already
>> exception safe.
>>
>> FWIW the libgcj java runtime environment uses this strategy for handling
>> NullPointerExceptions and DivideByZeroError(sp?). Since all that code for
>> the most part follows the standard C++ ABIs, it is an example of this
>> technique that has been deployed in many environments.
>
> Other way around: a *library* that wants to use exception handling
> can't do so safely without the cooperation, or at least understanding,
> of the main program and every other library that wants to do something
> similar. Suppose my library installs a SIGFPE handler and throws
> my_sigfpe_exception and your library installs a SIGFPE handler and
> throws your_sigfpe_exception. The result: one wins and the other
> crashes due to an unhandled exception.
>
> In my particular usecase, I have code (known to the main program) that
> catches all kinds of fatal signals to log nice error messages before
> dying. That means that I can't use a library that handles signals for
> any other purpose. Right now I want to have a small snippet of code
> handle SIGBUS, but now I need to coordinate it with everything else.
>
> If this stuff were unified, then everything would just work.

That's right. But I think the Linux kernel already supplies all the
needed functionality to do this. It is really a matter of choosing a
userspace implementation and standardizing your entire system around it.
In the realm of GNU/glibc/Linux, it is more of a social/political
exercise than a technical problem.

David Daney

2013-07-19 16:29:31

by Joseph Myers

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On Thu, 18 Jul 2013, Andy Lutomirski wrote:

> 2. It's localized. So you can mmap something, read from it *and
> handle SIGBUS*, and unmap.

There is of course no guarantee that possibly faulting memory accesses are
preserved (GCC should never introduce such an access where it wouldn't
occur in the abstract machine, but may well *remove* accesses that aren't
required). Hopefully you don't want to rely on a guarantee that faults
will happen....

> It's worth noting that SIGBUS isn't the only interesting signal here.
> SIGFPE could work, too. I'm not sure whether SIGPIPE would make

The SIGFPE case could potentially be relevant for C bindings to IEEE
754-2008 (clause 8) alternate exception handling, where the expectation is
that various forms of exception handling attribute can be associated with
a block. Though what such bindings will end up looking like, and to what
extent the various cases involved will correspond to things that could be
implemented using SIGFPE, remains to be seen - checking floating-point
exception flags after each operation could turn out to be a better
implementation approach in some cases. The C floating-point group hasn't
got as far as designing such bindings yet (currently working on parts 1-3
of draft TS 18661, and optional attributes are to go in part 5).

(Note: although I'm interested in these floating-point issues and am
reviewing all the parts of the draft C bindings as they become available,
I don't have current plans for implementing them - or the remaining bits
of C99/C11 Annex F and Annex G not yet implemented in GCC. There's
certainly a lot of work to be done there in both GCC and glibc, with the
exception handling attributes from part 5, and rounding mode attributes
from part 1, probably among the larger pieces.)

--
Joseph S. Myers

2013-07-19 19:07:53

by Andy Lutomirski

Subject: Re: [RFC / musing] Scoped exception handling in Linux userspace?

On Fri, Jul 19, 2013 at 9:22 AM, David Daney wrote:
> On 07/18/2013 08:29 PM, Andy Lutomirski wrote:
>>
>> Other way around: a *library* that wants to use exception handling
>> can't do so safely without the cooperation, or at least understanding,
>> of the main program and every other library that wants to do something
>> similar. Suppose my library installs a SIGFPE handler and throws
>> my_sigfpe_exception and your library installs a SIGFPE handler and
>> throws your_sigfpe_exception. The result: one wins and the other
>> crashes due to an unhandled exception.
>>
>> In my particular usecase, I have code (known to the main program) that
>> catches all kinds of fatal signals to log nice error messages before
>> dying. That means that I can't use a library that handles signals for
>> any other purpose. Right now I want to have a small snippet of code
>> handle SIGBUS, but now I need to coordinate it with everything else.
>>
>> If this stuff were unified, then everything would just work.
>
>
> That's right. But I think the Linux kernel already supplies all the needed
> functionality to do this. It is really a matter of choosing a userspace
> implementation and standardizing your entire system around it. In the realm
> of GNU/GLibc/Linux, it is really more of social/political exercise rather
> than a technical problem.
>

The social problem could be solved by glibc (or maybe ld.so)
installing the relevant handlers automatically and taking advantage of
its sigaction wrapper to keep everything working. But this has
technical problems:

1. Semantic changes: things like kill(pid, SIGSEGV) will no longer
result in a fatal signal, which would be a regression (albeit probably
harmless). The results from /proc/pid/status might look a bit odd.
Separating out signals resulting from faulting instructions (vs other
causes) might be tricky. I'm also not sure whether the ignored states
of SIGSEGV and SIGFPE are preserved across exec, but, if they are,
glibc will have trouble emulating this.

2. Unhandled signals: if SIGSEGV is handled (by, say, glibc) but there
is no exception handler that claims the signal, then there's currently
no way to tell the kernel to do everything it normally does on an
unhandled fatal signal (e.g. logging, dumping core correctly,
notifying ptracers, sending the right failure code to waitid).

--Andy