LinuxLists.cc - hexagon: signal handling bugs

2012-02-11 23:27:45

Subject: hexagon: signal handling bugs

1) unless I'm seriously misreading hexagon vm_entry.S, once we'd
called do_notify_resume, we do *not* recheck if there's more work to
be done. In particular, if we'd got more signals pending - e.g. from
SIGSEGV we'd got while trying to set a sigframe up. That's not a good
thing for a lot of reasons. Note that once that is fixed, you'll need
to make sure that restart logics is applied only *once* - not for subsequent
sigframes.

2) sigreturn should *NOT* be restartable at all, or you'll get
a lot of hard-to-reproduce fun. AFAICS, the right thing to do would
be to set ->syscall_nr to something negative, not __NR_rt_sigreturn in there
(and return regs->r00 instead of 0, to avoid the check in do_trap0).
Reason: suppose you have e.g. a loop that runs through different values in
r00. No syscalls in it. A signal comes and gets treated on the way
out of kernel after the next e.g. timer interrupt. We save the values
of registers into a sigframe and go into the handler. Another signal
comes; the next time we reach the kernel is when we get to rt_sigreturn().
There we
* have registers restored from on-stack sigframe
* have ->syscall_nr set to __NR_rt_sigreturn
* hit handle_signal() on the way out
* check ->r00; note that this is the value we have just restored
from sigframe. If it happens to be -ERESTARTNOHAND, silently replace it
with -EINTR. And eventually return to clueless userland.
Result: if timer interrupt happens when the userland loop goes through
-ERESTARTNOHAND, the value of r00 is clobbered upon return from signal handler.
Fun debugging that kind of bugs: priceless...

3) on sigreturn we need to set current->restart_block.fn; hexagon
doesn't do that.

4) altstack can overflow; you don't check that. See
commit 83bd01024b1fdfc41d9b758e5669e80fca72df66
Author: Roland McGrath <[email protected]>
Date: Wed Jan 30 13:30:06 2008 +0100
for an analog.

5) try_to_freeze() has no business being called in do_signal();
let get_signal_to_deliver() handle that.

I don't have any access to hardware *or* cross-toolchain, so all I can do
here is RTFS and review. For #2..#5 I would not be afraid to post untested
patches (they are essentially trivial), but #1 is well beyond my arrogance
threshold - messing in assembler glue on an architecture I don't know *and*
can't test... Sorry.

2012-02-15 17:45:23

by Richard Kuo

[permalink] [raw]

Subject: Re: hexagon: signal handling bugs

On Sat, Feb 11, 2012 at 11:27:40PM +0000, Al Viro wrote:
> I don't have any access to hardware *or* cross-toolchain, so all I can do
> here is RTFS and review. For #2..#5 I would not be afraid to post untested
> patches (they are essentially trivial), but #1 is well beyond my arrogance
> threshold - messing in assembler glue on an architecture I don't know *and*
> can't test... Sorry.

Thanks for reviewing the code. Admittedly, it's been a while since I looked
at the signal handling bits, so I will need to review it again and test the
changes where appropriate. I'll post a patch or something for review when
ready...

Thanks again,
Richard Kuo

--

Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

2012-02-15 18:18:30

by Linas Vepstas

[permalink] [raw]

Subject: Re: hexagon: signal handling bugs

On 11 February 2012 17:27, Al Viro <[email protected]> wrote:
> 1) unless I'm seriously misreading hexagon vm_entry.S, once we'd
> called do_notify_resume, we do *not* recheck if there's more work to
> be done.

> 2) sigreturn should *NOT* be restartable at all, or you'll get
....

Is there an architecture that "does this best"? I (re-)wrote a
significant part of the hexagon sig-handling code, and I did it via
"surgical hacking": by reading the code for 4-6 other arches, and
trying to pick out the "best practices" of each. But its easy to
build a Frankenstein when using body parts from mis-matched places, so
perhaps many of these mistakes were mine...

Perhaps its desirable to modify most/all architectures, so as to evoke
more commonality? Some of the other, more minor arches sure did look
incomplete to me...

-- Linas