2005-04-11 17:14:37

by Daniel Walker

Subject: RT and Signals


I'm not sure if this has changed at all in recent RT patches, but I've
noticed several issues popping up that are related to the timer
interrupt sending signals. One in particular is that send_sig() calls
into __cache_alloc(), which has its interrupt-disable protections
removed in RT. I've observed slab corruption due to this while running
lmbench and LTP.

Another issue was a livelock related to the timer interrupt calling
send_sig(), which takes tasklist_lock and siglock, both of which are
mutexes (deadlock detection was on, but didn't trigger).

LTP and lmbench seem to bring all these issues to the surface, but the
issues differ depending on the architecture. I've been treating the
symptoms, but not the disease. Ultimately, we need some protections
in signal delivery to stop the timer interrupt.

Daniel


2005-04-11 18:06:21

by john cooper

Subject: Re: RT and Signals

Daniel Walker wrote:
> I'm not sure if this has changed at all in recent RT patches, but I've
> noticed several issues popping up that are related to the timer
> interrupt sending signals...

I've also seen BUG asserts kicking in on PPC (40-04-ish) in
signal delivery [actual receipt] paths. These have only shown up
under fairly heavy load conditions and are presumably hitting
an infrequent path in force_sig_info(), IIRC. I haven't had the
time yet to resolve these, but they are on the list.

-john



2005-04-11 18:37:54

by Daniel Walker

Subject: Re: RT and Signals

On Mon, 2005-04-11 at 11:03, john cooper wrote:
> Daniel Walker wrote:
> > I'm not sure if this has changed at all in recent RT patches, but I've
> > noticed several issues popping up that are related to the timer
> > interrupt sending signals...
>
> I've also seen BUG asserts kicking in on PPC (40-04-ish) in
> signal delivery [actual receipt] paths. These have only shown up
> under fairly heavy load conditions and are presumably hitting
> an infrequent path in force_sig_info(), IIRC. I haven't had the
> time yet to resolve these, but they are on the list.

Ingo's solution was to remove signal delivery from the timer
interrupt altogether; I don't know if that's acceptable. It seems
like there is a never-ending stream of bugs related to this.

Daniel
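
The thread does not say where signal delivery would go once it is taken
out of the timer interrupt. As a rough illustration only, one generic way
to picture it is deferral: the interrupt path just marks a signal as
pending, and the actual delivery (which may allocate memory or take
sleeping locks) happens later in task context. The sketch below is a
user-space model, not kernel code, and every name in it (task_model,
timer_interrupt_model, deliver_pending_model) is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a task; not a kernel structure. */
struct task_model {
    atomic_bool timer_signal_pending; /* set from the modelled interrupt */
    int signals_delivered;            /* touched only in "task context" */
};

static void task_model_init(struct task_model *t)
{
    assert(t != NULL);
    atomic_init(&t->timer_signal_pending, false);
    t->signals_delivered = 0;
}

/*
 * Models the timer interrupt: it does no allocation and takes no locks,
 * it only records that a signal is owed to the task.
 */
static void timer_interrupt_model(struct task_model *t)
{
    atomic_store(&t->timer_signal_pending, true);
}

/*
 * Models a later point in task context, where allocation and sleeping
 * locks would be safe. Returns 1 if a pending signal was delivered,
 * 0 if there was nothing to do.
 */
static int deliver_pending_model(struct task_model *t)
{
    /* Atomically claim and clear the pending flag. */
    if (!atomic_exchange(&t->timer_signal_pending, false))
        return 0;
    t->signals_delivered++;
    return 1;
}
```

Note that a single flag coalesces ticks: if timer_interrupt_model() runs
twice before deliver_pending_model() is called, only one delivery
happens. Whether that is acceptable depends on the signal semantics
required, which this model does not address.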