by Eric W. Biederman

Subject: Re: [PATCH 0/3] signal: requeuing undeliverable signals

Kyle Huey <[email protected]> writes:

> On Mon, Nov 15, 2021 at 9:31 PM Eric W. Biederman <[email protected]> wrote:
>>
>>
>> Kyle Huey recently reported[1] that rr gets confused if SIGKILL prevents
>> ptrace_signal from delivering a signal, as the kernel sets up a signal
>> frame for a signal that rr did not have a chance to observe with ptrace.
>>
>> In looking into it I found a couple of bugs and a quality of
>> implementation issue.
>>
>> - The test for signal_group_exit should be inside the for loop in get_signal.
>> - Signals should be requeued on the same queue they were dequeued from.
>> - When a fatal signal is pending ptrace_signal should not return another
>> signal for delivery.
>>
>> Kyle Huey has verified[2] an earlier version of this change.
>>
>> I have reworked things one more time to completely fix the issues
>> raised, and to keep the code maintainable long term.
>>
>> I have smoke-tested this code and, combined with a careful review, I
>> expect it to work fine. Kyle, if you can double-check that my last
>> round of changes still works for rr, I would appreciate it.
>
> This still fixes the race we reported.
>
> Tested-by: Kyle Huey <[email protected]>

Thank you very much for retesting.

Eric

2021-11-18 06:12:39

by Marko Mäkelä

Subject: Re: [PATCH 0/3] signal: requeuing undeliverable signals

On Wed, Nov 17, 2021 at 6:51 PM Eric W. Biederman <[email protected]> wrote:
>
> Kyle Huey <[email protected]> writes:
>
> > On Mon, Nov 15, 2021 at 9:31 PM Eric W. Biederman <[email protected]> wrote:
> >>
> >>
> >> Kyle Huey recently reported[1] that rr gets confused if SIGKILL prevents
> >> ptrace_signal from delivering a signal, as the kernel sets up a signal
> >> frame for a signal that rr did not have a chance to observe with ptrace.
> >>
> >> In looking into it I found a couple of bugs and a quality of
> >> implementation issue.
> >>
> >> - The test for signal_group_exit should be inside the for loop in get_signal.
> >> - Signals should be requeued on the same queue they were dequeued from.
> >> - When a fatal signal is pending ptrace_signal should not return another
> >> signal for delivery.
> >>
> >> Kyle Huey has verified[2] an earlier version of this change.
> >>
> >> I have reworked things one more time to completely fix the issues
> >> raised, and to keep the code maintainable long term.
> >>
> >> I have smoke-tested this code and, combined with a careful review, I
> >> expect it to work fine. Kyle, if you can double-check that my last
> >> round of changes still works for rr, I would appreciate it.
> >
> > This still fixes the race we reported.
> >
> > Tested-by: Kyle Huey <[email protected]>
>
> Thank you very much for retesting.
>
> Eric

Thank you, Kyle and Eric, for reporting and fixing the root cause of this race.

Meanwhile, I followed Kyle's suggestion and will disable the crash
handlers in the tracee whenever it is being traced.

Marko
--
Marko Mäkelä, Lead Developer InnoDB
MariaDB Corporation