2005-11-12 15:54:59

by Pozsar Balazs

[permalink] [raw]
Subject: start_kernel / local_irq_enable() can be very slow

Hi all,

I've noticed that the local_irq_enable() call in
init/main.c::start_kernel() takes 0.3 .. 3.0 seconds on every boot.
(Measured using printk's around it.)

I am wondering what happens during this call (which is effectively one
"sti"), why does it take so much time sometimes, and why does it vary so
much, why isn't it (more) deterministic.


This is a x86 machine, I can give more details if needed.
I've observed this behaviour on a few other machines too.


Anybody any clues?
thanks,

--
pozsy


2005-11-14 07:48:27

by Andrew Morton

[permalink] [raw]
Subject: Re: start_kernel / local_irq_enable() can be very slow

Pozsar Balazs <[email protected]> wrote:
>
> I've noticed that the local_irq_enable() call in
> init/main.c::start_kernel() takes 0.3 .. 3.0 seconds on every boot.
> (Measured using printk's around it.)
>
> I am wondering what happens during this call (which is effectively one
> "sti"), why does it take so much time sometimes, and why does it vary so
> much, why isn't it (more) deterministic.

Presumably as soon as it does the sti, the CPU takes one or more interrupts
and runs off and does something.

I wonder what?

You could do something like:

int trace_irqs;

trace_irqs = 1;
local_irq_enble();
trace_irqs = 0;

then, over in handle_IRQ_event():

if (trace_irqs)
print_symbol("calling %s\n", (unsigned long)action->handler);


2005-11-14 15:42:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: start_kernel / local_irq_enable() can be very slow



On Sun, 13 Nov 2005, Andrew Morton wrote:
>
> You could do something like:
>
> int trace_irqs;
>
> trace_irqs = 1;
> local_irq_enble();
> trace_irqs = 0;

Do "trace_irqs = 10" first.

> then, over in handle_IRQ_event():
>
> if (trace_irqs)
> print_symbol("calling %s\n", (unsigned long)action->handler);

And decrement it here somewhere.

If it's delayed by up to three seconds, it sounds like there's a _lot_ of
interrupts happening, and I don't think there's any point in showing all
of them.

Linus

2005-11-14 15:55:43

by Pozsar Balazs

[permalink] [raw]
Subject: Re: start_kernel / local_irq_enable() can be very slow


> > You could do something like:
> >
> > int trace_irqs;
> >
> > trace_irqs = 1;
> > local_irq_enble();
> > trace_irqs = 0;
>
> Do "trace_irqs = 10" first.
>
> > then, over in handle_IRQ_event():
> >
> > if (trace_irqs)
> > print_symbol("calling %s\n", (unsigned long)action->handler);
>
> And decrement it here somewhere.
>
> If it's delayed by up to three seconds, it sounds like there's a _lot_ of
> interrupts happening, and I don't think there's any point in showing all
> of them.

Well, after using tons of printk's I found that the following is
happening:
- irqs get enabled, and the timer interrupt gets handled,
- arch/i386/kernel/timer.c::timer_interrupt gets called,
- which in turn calls cur_timer->mark_offset(), which is
- arch/i386/kernel/timers/timer_pm.c::mark_offset_pmtmr(), where we
finally arrive to:
/* compensate for lost ticks */
if (lost >= 2)
jiffies_64 += lost - 1;

So no irq storm here, everything works as it should, the jiffies gets
updated as it should, that's why it _seems_ from printk's that it takes
long. (And only 1 irq happens.)


The main question though remains: what takes up seconds on the kernel
boot, and why is it so undeterministic how long it takes (sometimes just
0.3s, sometimes even 3s)? (Something before timers get enabled.)


--
pozsy