2005-09-29 07:54:33

by Peter Zijlstra

Subject: Re: 2.6.13-rc6-rt9

On Sat, 2005-08-20 at 17:24 -0400, Jeff Dike wrote:
> On Sat, Aug 20, 2005 at 09:27:25PM +0200, Peter Zijlstra wrote:
> > Jeff, could you help us out here?
> > What exactly does uml need to get out of the calibrate delay loop?
>
> Interrupts, it's not too demanding :-)
>
> If it's not seeing VTALRM, then it will never leave the calibration loop.
>
> Try stracing it and see what it's getting.

Sorry for the late reply.

Yes, that does seem to be the problem.

Even with a current -rt (2.6.14-rc2-rt5) UML does not run. The issue is
indeed (as Jeff pointed out) that VTALRM is never sent. The small test
program below illustrates this.

On a non-rt kernel it completes in 1 second.
On a -rt kernel it waits ad infinitum.

Kind regards,

Peter Zijlstra

---------------------------

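#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <signal.h>

/* Set from the signal handler; sig_atomic_t is safe to write there. */
volatile sig_atomic_t quit = 0;

void sig_vtalrm(int signr, siginfo_t * si, void * arg)
{
	if (signr == SIGVTALRM) quit = 1;
}

int main()
{
	/* One-shot virtual timer: no interval, fires after 1s of user CPU time. */
	struct itimerval ival = {{0,0}, {1, 0}};

	struct sigaction sa;
	sa.sa_sigaction = sig_vtalrm;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_SIGINFO;	/* required when sa_sigaction is used */
	sigaction(SIGVTALRM, &sa, NULL);

	setitimer(ITIMER_VIRTUAL, &ival, NULL);

	printf("wait\n");
	while (!quit) ;	/* busy-wait so that user CPU time accumulates */
	printf("done\n");
	return 0;
}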


--
Peter Zijlstra <[email protected]>
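
Because the test program above spins forever on an affected kernel, a variant with a wall-clock deadline may be easier to script. The following is only a sketch under stated assumptions: the function name vtalrm_delivered(), its parameters, and the helper monotonic_now() are hypothetical and not part of the original report. It arms a one-shot ITIMER_VIRTUAL and busy-waits until either SIGVTALRM arrives or the deadline passes.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

static volatile sig_atomic_t got_vtalrm = 0;

static void on_vtalrm(int signr)
{
	if (signr == SIGVTALRM)
		got_vtalrm = 1;
}

/* Current monotonic wall-clock time in seconds (hypothetical helper). */
static double monotonic_now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec / 1e9;
}

/*
 * Arm a one-shot ITIMER_VIRTUAL for timer_usec of user CPU time, then spin
 * for at most timeout_sec of wall-clock time.
 * Returns 1 if SIGVTALRM arrived, 0 on timeout, -1 if setup failed.
 */
int vtalrm_delivered(long timer_usec, double timeout_sec)
{
	struct itimerval ival = {{0, 0}, {timer_usec / 1000000, timer_usec % 1000000}};
	struct itimerval off = {{0, 0}, {0, 0}};
	struct sigaction sa;
	double deadline;

	got_vtalrm = 0;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_vtalrm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGVTALRM, &sa, NULL) != 0)
		return -1;
	if (setitimer(ITIMER_VIRTUAL, &ival, NULL) != 0)
		return -1;

	deadline = monotonic_now() + timeout_sec;
	while (!got_vtalrm && monotonic_now() < deadline)
		;	/* busy-wait: the virtual timer only counts user CPU time */

	setitimer(ITIMER_VIRTUAL, &off, NULL);	/* disarm in case we timed out */
	return got_vtalrm ? 1 : 0;
}
```

Calling vtalrm_delivered(1000000, 5.0) from main() would mirror the original test: a return value of 1 is the behaviour reported for a non-rt kernel, while 0 would correspond to the hang seen on -rt.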


2005-09-30 01:00:09

by Paul E. McKenney

Subject: Re: 2.6.13-rc6-rt9

On Thu, Sep 29, 2005 at 09:54:23AM +0200, Peter Zijlstra wrote:
> On Sat, 2005-08-20 at 17:24 -0400, Jeff Dike wrote:
> > On Sat, Aug 20, 2005 at 09:27:25PM +0200, Peter Zijlstra wrote:
> > > Jeff, could you help us out here?
> > > What exactly does uml need to get out of the calibrate delay loop?
> >
> > Interrupts, it's not too demanding :-)
> >
> > If it's not seeing VTALRM, then it will never leave the calibration loop.
> >
> > Try stracing it and see what it's getting.
>
> Sorry for the late reply.
>
> Yes, that does seem to be the problem.
>
> Even with a current -rt (2.6.14-rc2-rt5) UML does not run. The issue is
> indeed (as Jeff pointed out) that VTALRM is never sent. The small test
> program below illustrates this.
>
> On a non-rt kernel it completes in 1 second.
> On a -rt kernel it waits ad infinitum.

Will play with it and see what I broke...

Thanx, Paul

> Kind regards,
>
> Peter Zijlstra
>
> ---------------------------
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/time.h>
> #include <signal.h>
>
> volatile int quit = 0;
>
> void sig_vtalrm(int signr, siginfo_t * si, void * arg)
> {
> if (signr == SIGVTALRM) quit = 1;
> }
>
> int main()
> {
> struct itimerval ival = {{0,0}, {1, 0}};
>
> struct sigaction sa;
> sa.sa_sigaction = sig_vtalrm;
> sigemptyset(&sa.sa_mask);
> sa.sa_flags = 0;
> sigaction(SIGVTALRM, &sa, NULL);
>
> setitimer(ITIMER_VIRTUAL, &ival, NULL);
>
> printf("wait\n");
> while (!quit) ;
> printf("done\n");
> }
>
>
> --
> Peter Zijlstra <[email protected]>
>
>

2005-09-30 01:06:48

by Thomas Gleixner

Subject: Re: 2.6.13-rc6-rt9

On Thu, 2005-09-29 at 18:00 -0700, Paul E. McKenney wrote:
> > Even with a current -rt (2.6.14-rc2-rt5) UML does not run. The issue is
> > indeed (as Jeff pointed out) that VTALRM is never sent. The small test
> > program below illustrates this.
> >
> > On a non-rt kernel it completes in 1 second.
> > On a -rt kernel it waits ad infinitum.
>
> Will play with it and see what I broke...

Paul,

you are not the culprit :)

The run_posix_cpu_timers(p) call is #ifdef'd out with PREEMPT_RT.

That's a hard-to-fix issue.

It cannot be run from hardirq context, as it takes a lot of locks
(especially our favorites: tasklist_lock and sighand->siglock). :(

Maybe another playground for rcu, but it might also be solved by some
other mechanism for accounting and delayed execution in the PREEMPT_RT
case.

tglx


2005-09-30 01:46:04

by Paul E. McKenney

Subject: Re: 2.6.13-rc6-rt9

On Fri, Sep 30, 2005 at 03:07:29AM +0200, Thomas Gleixner wrote:
> On Thu, 2005-09-29 at 18:00 -0700, Paul E. McKenney wrote:
> > > Even with a current -rt (2.6.14-rc2-rt5) UML does not run. The issue is
> > > indeed (as Jeff pointed out) that VTALRM is never sent. The small test
> > > program below illustrates this.
> > >
> > > On a non-rt kernel it completes in 1 second.
> > > On a -rt kernel it waits ad infinitum.
> >
> > Will play with it and see what I broke...
>
> Paul,
>
> you are not the culprit :)

Woo-hoo!!! Exonerated!!! This time, anyway... ;-)

> The run_posix_cpu_timers(p) call is #ifdef'd out with PREEMPT_RT.
>
> That's a hard-to-fix issue.
>
> It cannot be run from hardirq context, as it takes a lot of locks
> (especially our favorites: tasklist_lock and sighand->siglock). :(
>
> Maybe another playground for rcu, but it might also be solved by some
> other mechanism for accounting and delayed execution in the PREEMPT_RT
> case.

Certainly check_thread_timers() and check_process_timers() are playing
with a number of task_struct fields, so it is not immediately clear
to me how to safely replace tasklist_lock with RCU, at least not with
a simple and small patch.

What did you have in mind for delayed execution?

Thanx, Paul

2005-09-30 06:16:52

by Thomas Gleixner

Subject: Re: 2.6.13-rc6-rt9

On Thu, 2005-09-29 at 18:46 -0700, Paul E. McKenney wrote:
> > you are not the culprit :)
>
> Woo-hoo!!! Exonerated!!! This time, anyway... ;-)

My pleasure :)


> > It cannot be run from hardirq context, as it takes a lot of locks
> > (especially our favorites: tasklist_lock and sighand->siglock). :(
> >
> > Maybe another playground for rcu, but it might also be solved by some
> > other mechanism for accounting and delayed execution in the PREEMPT_RT
> > case.
>
> Certainly check_thread_timers() and check_process_timers() are playing
> with a number of task_struct fields, so it is not immediately clear
> to me how to safely replace tasklist_lock with RCU, at least not with
> a simple and small patch.
>
> What did you have in mind for delayed execution?

Do only the time check in hard irq context and defer the lock-protected
operations to a softirq context. I have to look deeper at the details,
though.

tglx
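
The split proposed in the last message (a cheap expiry check in hard irq context, with the lock-protected work deferred) can be illustrated with a small user-space model. This is a sketch only, not kernel code: struct task_model, tick_fast_path() and run_deferred_expiry() are hypothetical names invented for the illustration, and the locking is represented by comments rather than real tasklist_lock or sighand->siglock calls.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-task timer state. */
struct task_model {
	unsigned long utime;		/* accumulated user-time ticks */
	unsigned long virt_expires;	/* tick at which the timer fires, 0 = unarmed */
	bool expiry_pending;		/* set by the fast path, cleared by the slow path */
	int signals_sent;		/* number of simulated SIGVTALRM deliveries */
};

/*
 * Fast path, modelling the hard irq tick: account one tick of user time and
 * compare against the expiry. It takes no locks; it only sets a flag.
 */
void tick_fast_path(struct task_model *t)
{
	t->utime++;
	if (t->virt_expires && t->utime >= t->virt_expires)
		t->expiry_pending = true;
}

/*
 * Slow path, modelling the deferred (softirq-like) context: this is where
 * the locks would be taken and the signal queued.
 * Returns 1 if an expiry was handled, 0 if nothing was pending.
 */
int run_deferred_expiry(struct task_model *t)
{
	if (!t->expiry_pending)
		return 0;

	/* In the real kernel the locks would be acquired here. */
	t->expiry_pending = false;
	t->virt_expires = 0;	/* one-shot timer: disarm after firing */
	t->signals_sent++;	/* stands in for queueing SIGVTALRM */
	/* ...and released here. */
	return 1;
}
```

In this model the tick handler stays lock-free, and the signal is delivered only when the deferred function runs, so delivery is delayed by however long the deferred context takes to be scheduled.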