2001-02-06 05:30:23

by Mark Spencer

Subject: Linux interrupt latency

I'm working on the Linux driver for the Tormenta public domain dual T1
card (see http://www.bsdtelephony.com.mx). This card is a controllerless
ISA T1 card with no memory, meaning the host CPU must load the next 48
outgoing bytes and read the previous 48 incoming bytes off the ISA bus
8000 times per second (every 125 microseconds). Further, because the
buffers are constantly being overwritten by the card, the actual interrupt
handler must run within 4-28 microseconds from when the card issues the
interrupt.
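
To give a feel for it, the per-interrupt work is just tight programmed
I/O against the card. A rough sketch (the port numbers, names and buffer
handling here are placeholders for illustration, not the real Tormenta
register map):

#include <linux/kernel.h>
#include <linux/sched.h>
#include <asm/io.h>

#define TOR_BASE   0x300              /* hypothetical ISA base port */
#define TOR_DATA   (TOR_BASE + 0)     /* hypothetical data window   */

static unsigned char txbuf[48], rxbuf[48];

static void tor_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
	int i;

	/* Every 125 usec: push the next 48 outgoing bytes and pull the
	 * previous 48 incoming bytes before the card overwrites them. */
	for (i = 0; i < 48; i++)
		outb(txbuf[i], TOR_DATA);
	for (i = 0; i < 48; i++)
		rxbuf[i] = inb(TOR_DATA);
}

However the transfers are actually batched on the bus, that is a lot of
ISA I/O cycles to squeeze into every 125 microsecond frame.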


On my primary test machine, a Pentium II, 450MHz, with Intel 430BX
chipset, the board runs fine with both IDE and SCSI drives (note: DMA must
be turned on for the IDE drives). However, on other chipsets, like VIA,
the card misses 2-3 interrupts every 7989-7991 samples (almost exactly*
one second). Further, even with DMA turned on, the IDE disk definitely
kills the interrupt latency entirely.

Oh, and as a side note... The card works flawlessly in FreeBSD (although
only with SCSI) and definitely does not have the 7989-sample problem. The
problem occurs with both Linux 2.2.16 and 2.4.0...

Can anyone suggest what might be causing the problem on non-Intel
chipsets, particularly what event might be occurring once per second and
disabling interrupts for a couple of hundred microseconds? Thanks!

Mark



2001-02-06 06:26:46

by Linus Torvalds

Subject: Re: Linux interrupt latency

In article <Pine.LNX.4.21.0102052304170.13906-100000@hoochie.linux-support.net>,
Mark Spencer <[email protected]> wrote:
>I'm working on the Linux driver for the Tormenta public domain dual T1
>card (see http://www.bsdtelephony.com.mx).

Hmm.. Sounds like somebody has designed a truly crappy card. Everything
is allowable in the name of being cheap, I guess ;)

> Further, because the
>buffers are constantly being overwritten by the card, the actual interrupt
>handler must run within 4-28 microseconds from when the card issues the
>interrupt.

You do know that doing even just _one_ ISA IO access takes about a
microsecond? The above sounds like somebody designed the card a bit too
close to the specs - things like interrupt ACK overhead etc is taking up
an uncomfortably big slice of your available latency.

>On my primary test machine, a Pentium II, 450MHz, with Intel 430BX
>chipset, the board runs fine with both IDE and SCSI drives (note: DMA must
>be turned on for the IDE drives). However, on other chipsets, like VIA,
>the card misses 2-3 interrupts every 7989-7991 samples (almost exactly*
>one second). Further, even with DMA turned on, the IDE disk definitely
>kills the interrupt latency entirely.

If you can tell when it misses the interrupt (ie if the card has some
"overrun" bit or something), a simple profile might be useful. Just
make the interrupt handler save away an array of eip's when the overrun
happens, and they'll almost certainly point to something interesting (or
rather, they should point to just _after_ something interesting). You
can get the eip by just looking it up from the pt_regs->eip that you get
in your interrupt handler.
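
Something along these lines (the status port and overrun bit are made-up
names here - use whatever the card actually provides):

/* placeholders - substitute the card's real status register and bit */
#define TOR_STATUS  0x301
#define TOR_OVERRUN 0x01

#define NR_TRACE 256

static unsigned long overrun_eip[NR_TRACE];
static int overrun_count;

static void tor_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
	if ((inb(TOR_STATUS) & TOR_OVERRUN) && overrun_count < NR_TRACE) {
		/* Remember where the CPU was when the interrupt finally
		 * got through: the culprit is usually just before this. */
		overrun_eip[overrun_count++] = regs->eip;
	}

	/* ... normal 48-byte in/out work ... */
}

Dump the array with a printk later (or through /proc) and look the
addresses up in System.map.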

>Can anyone suggest what might be causing the problem on non-Intel
>chipsets, particularly what event might be occurring once per second and
>disabling interrupts for a couple of hundred microseconds? Thanks!

Hmm.. The only thing that I can think of happening once a second is the
second overflow thing and the associated NTP maintenance, but that's
very lightweight. There might be some user-mode interaction, of course,
with people waking up or something - does it also happen in single-user
mode?

The non-intel chipset issue might just be due to timing being marginal
together with slow interrupt controllers - if you compile for an
old-style interrupt controller, interrupt handling will do a _minimum_
of 5 IO cycles to the interrupt controller. If the interrupt controller
has ISA timings, that will take 5 usecs right there. I _think_ the Intel
chipsets actually have the irq controller on the PCI side.

You can lower interrupt latency by either using the APIC, or if you are
using the i8259 you can edit arch/i386/kernel/i8259.c and search for
DUMMY and remove those two lines. It should avoid _one_ expensive IO
cycle, and considering your constraints it might be worth it.
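
For reference, the slow path boils down to something like this for an
IRQ on the cascaded controller (a rough illustration of the I/O traffic,
not a cut-and-paste from i8259.c; "mask" stands for the cached mask
byte):

inb(0xA1);                            /* DUMMY read, as mentioned above */
outb(mask | (1 << (irq - 8)), 0xA1);  /* mask the line off              */
outb(0x60 + (irq & 7), 0xA0);         /* specific EOI to the slave      */
outb(0x62, 0x20);                     /* specific EOI for IRQ2 (cascade)*/
	/* ... device handler runs ... */
outb(mask, 0xA1);                     /* unmask when the handler is done*/

Five ISA-speed cycles at roughly a microsecond apiece, on top of
whatever the driver itself does.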

It would not be hard to make "fast" ISA interrupts (that only ACK after
the fact and thus do not need to mask themselves off - instead of using
5 IO cycles you could do it with one or two depending on whether it's
from the primary or secondary controller) these days with the current
interrupt controller layer, but quite frankly nobody has bothered. It
sounds like you might want to look into this, though.
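
Roughly, such a fast path would just be (a sketch of the idea, not
existing kernel code - this assumes an edge-triggered line):

	/* ... run the device handler first ... */
	outb(0x60 + irq, 0x20);     /* one specific EOI to the master */

for the primary controller, plus one extra EOI cycle for the cascade if
the line sits on the secondary one.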

But try to see if you can get a profile of what it is that leads up to
the problem first..

Linus

2001-02-06 14:04:04

by Steve Underwood

Subject: Re: Linux interrupt latency

Linus Torvalds wrote:
>
> In article <Pine.LNX.4.21.0102052304170.13906-100000@hoochie.linux-support.net>,
> Mark Spencer <[email protected]> wrote:
> >I'm working on the Linux driver for the Tormenta public domain dual T1
> >card (see http://www.bsdtelephony.com.mx).
>
> Hmm.. Sounds like somebody has designed a truly crappy card. Everything
> is allowable in the name of being cheap, I guess ;)

Harsh words. Although it has some limitations, there is rhyme and reason
in the design of this card. It is fairly cheap to assemble. It uses all
through-hole parts and is a form of open-source hardware, so anyone is
welcome to assemble one themselves. These days the desire for low-volume
hand assembly can really tie your hands, so to speak. Since the
interrupt service needs to read a set of audio samples, conference
between them, and output the result within one sample period, interrupt
latency is a critical issue. All the grunt work takes place in the
interrupt routine, to avoid imposing significant real-time constraints
on the user-space code.

> > Further, because the
> >buffers are constantly being overwritten by the card, the actual interrupt
> >handler must run within 4-28 microseconds from when the card issues the
> >interrupt.

Where does the 4-28 come from? The critical issue with interrupt timing
on this card is that a write from the ISA bus concurrent with a read
inside the Mitel chip that does the serial/parallel conversion causes a
conflict. This is probably where the problem lies - see later.

> You do know that doing even just _one_ ISA IO access takes about a
> microsecond? The above sounds like somebody designed the card a bit too
> close to the specs - things like interrupt ACK overhead etc is taking up
> an uncomfortably big slice of your available latency.
>
> >On my primary test machine, a Pentium II, 450MHz, with Intel 430BX
> >chipset, the board runs fine with both IDE and SCSI drives (note: DMA must
> >be turned on for the IDE drives). However, on other chipsets, like VIA,
> >the card misses 2-3 interrupts every 7989-7991 samples (almost exactly*
> >one second). Further, even with DMA turned on, the IDE disk definitely
> >kills the interrupt latency entirely.

A 450MHz PII processor isn't really up to the task. Jim Dixon, who
designed the card, recommends at least a 733MHz PIII, based on his
experience with the original BSD driver. I am able to run one as a dual
E1 card (therefore with 1/3 more workload than T1 mode), with Linux
2.2.16, on a 700MHz Athlon + VIA KX133 chipset, without slips - even
when I do some significant disk I/O. The large number of I/O cycles on
the ISA bus stalls the processor for about 1/3 of the time for a dual T1
card, and about 1/2 the time for a dual E1 card. This really sucks, and
needs to be addressed when the more serious long term PCI version of the
card goes through.
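
A rough back-of-the-envelope, assuming 16-bit transfers at about 1 usec
per ISA I/O cycle (Linus's figure above), is consistent with those
fractions:

  dual T1: 2 x 24 timeslots = 48 bytes each way per 125 usec frame,
           i.e. ~48 word-wide I/O cycles in total ~= 48 usec ~= 1/3 of the frame
  dual E1: 2 x 32 timeslots = 64 bytes each way per 125 usec frame,
           i.e. ~64 word-wide I/O cycles in total ~= 64 usec ~= 1/2 of the frame

The exact numbers depend on the transfer width, but the ratio doesn't:
E1 moves a third more data, so it costs proportionally more bus time.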

The FreeBSD and Linux drivers can both keep up, on suitable computers.
However, the write/read conflict issue I mentioned above requires some
very dirty dealings within the driver - look for a nasty little
sequence-jumbling table in the driver source. You will see that it only
just about works, and seriously needs a comprehensive fix in the PCI
version of the card.

A brand-new ISA card design that demands a high-speed CPU should tell
you this isn't a long-term solution, regardless of performance. Treat
this card more as a proof of concept.

> If you can tell when it misses the interrupt (ie if the card has some
> "overrun" bit or something), a simple profile might be useful. Just
> make the interrupt handler save away an array of eip's when the overrun
> happens, and they'll almost certainly point to something interesting (or
> rather, they should point to just _after_ something interesting). You
> can get the eip by just looking it up from the pt_regs->eip that you get
> in your interrupt handler.
>
> >Can anyone suggest what might be causing the problem on non-Intel
> >chipsets, particularly what event might be occurring once per second and
> >disabling interrupts for a couple of hundred microseconds? Thanks!

I think you are trying one board that is close to the limit, and one
just beyond. Simple as that.

> Hmm.. The only thing that I can think of happening once a second is the
> second overflow thing and the associated NTP maintenance, but that's
> very lightweight. There might be some user-mode interaction, of course,
> with people waking up or something - does it also happen in single-user
> mode?

There is some once-per-second activity in the driver, but not much. It's
unlikely to be the cause of the trouble. If things are really close to
the limit, I guess it might just be the straw that breaks the taxman's
back (OK, I like camels more than taxmen).

> The non-intel chipset issue might just be due to timing being marginal
> together with slow interrupt controllers - if you compile for an
> old-style interrupt controller, interrupt handling will do a _minimum_
> of 5 IO cycles to the interrupt controller. If the interrupt controller
> has ISA timings, that will take 5 usecs right there. I _think_ the Intel
> chipsets actually have the irq controller on the PCI side.

Most likely.

> You can lower interrupt latency by either using the APIC, or if you are
> using the i8259 you can edit arch/i386/kernel/i8259.c and search for
> DUMMY and remove those two lines. It should avoid _one_ expensive IO
> cycle, and considering your constraints it might be worth it.

Good idea. I'll have to try that. There always seem to be lots of
problem reports about IOAPIC, though. Should I trust it?

> It would not be hard to make "fast" ISA interrupts (that only ACK after
> the fact and thus do not need to mask themselves off - instead of using
> 5 IO cycles you could do it with one or two depending on whether it's
> from the primary or secondary controller) these days with the current
> interrupt controller layer, but quite frankly nobody has bothered. It
> sounds like you might want to look into this, though.
>
> But try to see if you can get a profile of what it is that leads up to
> the problem first..
>
> Linus

Regards,
Steve

2001-02-06 14:15:57

by Alan

Subject: Re: Linux interrupt latency

> A 450MHz PII processor isn't really up to the task. Jim Dixon, who
> designed the card, recommends at least a 733MHz PIII, based on his
> experience with the original BSD driver. I am able to run one as a dual

It shouldn't matter. If it's entirely tied to ISA bus performance you should
be able to run it on a 486, quite honestly.

> card, and about 1/2 the time for a dual E1 card. This really sucks, and
> needs to be addressed when the more serious long term PCI version of the
> card goes through.

Or hang a small CPU on it and shove it on USB, since USB can do
isochronous transfers.

> > >chipsets, particularly what event might be occurring once per second and
> > >disabling interrupts for a couple of hundred microseconds? Thanks!
>
> I think you are trying one board that is close to the limit, and one
> just beyond. Simple as that.

And quite a few chipsets steal cycles for other things (memory refresh, RAM
thermal limiting and the like). With the Z85230 driver, which at high
speed also really tests ISA bus throughput, I regularly saw PCI boxes
getting worse performance than ancient all-ISA relics. Throughput was also
comparable between a K5 and a PII/233.

> Good idea. I'll have to try that. There always seem to be lots of
> problem reports about IOAPIC, though. Should I trust it?

The problems with the IO-APIC are generally BIOS issues; however, there is
one concern I have. The APIC is a message-passing bus, so it doesn't have
a guaranteed delivery time.

2001-02-06 14:24:26

by Andrew Morton

Subject: Re: Linux interrupt latency

Mark Spencer wrote:
>
> Can anyone suggest what might be causing the problem on non-Intel
> chipsets, particularly what event might be occurring once per second and
> disabling interrupts for a couple of hundred microseconds? Thanks!

I have a gizmo which will find this for you.

http://www.uow.edu.au/~andrewm/linux/#timepegs

- Grab the 2.4.1-pre10 patch and tpt.
- Apply patch. Under `Kernel hacking', enable timepegs
and `Interrupt latency'. Make sure you enable IO-APIC
on Uniprocessor. Rebuild kernel. Reboot.
- Run `tpt' to zero all the counters.
- Use the system for a few minutes in a normal manner
- Run `tpt -s | sort -nr +5'

You'll get something like this (columns: code path, number of traversals,
then min, max, average and aggregate times in microseconds):

do_IRQ.in:0 -> softirq.c:71 6059 7.14 18.56 8.45 51237.71
do_IRQ.in:0 -> do_IRQ.out:0 255 11.60 16.69 13.47 3437.13
irq.c:476 -> irq.c:481 1 14.31 14.31 14.31 14.31
exit.c:384 -> exit.c:418 4 5.94 10.53 8.40 33.63
ll_rw_blk.c:759 -> ll_rw_blk.c:856 3709 .64 8.96 1.01 3754.22
3c59x.c:1835 -> 3c59x.c:1855 81405 2.53 8.63 2.92 238321.05
ide.c:513 -> ide.c:522 3709 .52 8.10 1.42 5300.37
irq.c:523 -> irq.c:542 1 7.14 7.14 7.14 7.14
signal.c:528 -> signal.c:546 13 1.95 6.10 4.22 55.00
page_alloc.c:181 -> page_alloc.c:198 4407 .41 5.86 .75 3355.55
sched.c:541 -> sched.c:596 8238 .35 5.22 1.22 10055.84
skbuff.c:121 -> skbuff.c:123 206065 .32 5.03 .36 74765.68
signal.c:602 -> signal.c:604 17 1.28 5.03 2.91 49.63
dev.c:1127 -> dev.c:1139 43268 .34 4.86 .41 17836.39
sched.c:713 -> sched.c:748 5497 .31 4.78 1.10 6055.92
ll_rw_blk.c:377 -> ide.c:1357 78 .93 4.63 3.36 262.83
timer.c:205 -> timer.c:209 43465 .33 4.61 .43 19044.11
slab.c:1298 -> slab.c:1322 129153 .32 4.59 .39 51402.35

So the worst interrupt latency path on this machine was 18 usecs,
from the entry into do_IRQ to the enabling of interrupts in do_softirq.
Traversed 6059 times, min,max,avg=7,18,8 usecs. Aggregate irq
blockage 51 msecs.