Dear list,
What latency should I expect for hardware interrupts under kernel 2.4 on
i386? I.e., how long should it take between the hardware signalling the
interrupt and the interrupt handler being called?
I am writing a driver which paces I/O with interrupts, generating one
interrupt after every transfer is done. Looking at the hardware schematics,
the interrupts should occur virtually instantly after each transfer, but the
driver is waiting ~1 ms per interrupt.
I can use polling instead with busy waits but this seems a bit wasteful.
My interrupt is shared with my graphics card using the non-GPL nvidia driver -
could this be responsible for the delay (any experience with this)?
cat /proc/interrupts
.....
10: 3028 XT-PIC eth0, VIA 82C686A
11: 1117037 XT-PIC nvidia, PI stage <-- my driver
12: 14776 XT-PIC usb-uhci, usb-uhci
.....
Thanks,
SA
On Fri, 7 February 2003 18:50:52 +0000, SA wrote:
>
> What latency should I expect for hardware interrupts under k2.4 / i386 ?
>
>
> ie how long should it take between the hardware signalling the interrupt and
> the interrupt handler being called?
Upper limit should be ~10,000 clock cycles.
In some tests on ppc, an RTOS was able to react within 200
cycles; Linux took 1000 or so. Add whatever time your handler (the C
code) takes.
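To compare a cycle budget with the ~1 ms reported in the question, it helps to convert cycles to time. The following is only a rough sketch: the 400 MHz clock is an assumed figure for illustration (the question does not state the CPU speed), and `cycles_to_us` is a hypothetical helper name.

```python
def cycles_to_us(cycles, clock_mhz):
    """Convert a CPU cycle count to microseconds at the given clock (MHz)."""
    return cycles / clock_mhz

# Assumed clock speed for illustration only; not stated in the question.
ASSUMED_CLOCK_MHZ = 400

upper_bound_us = cycles_to_us(10_000, ASSUMED_CLOCK_MHZ)  # the ~10,000 cycle limit
observed_us = 1000.0                                      # the ~1 ms reported

print(f"upper bound: {upper_bound_us:.1f} us, observed: {observed_us:.0f} us")
print(f"observed is {observed_us / upper_bound_us:.0f}x the upper bound")
```

Under that assumption the observed delay is well over an order of magnitude beyond the upper limit, which suggests the delay comes from somewhere other than the bare interrupt path.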
Jörn
--
When you close your hand, you own nothing. When you open it up, you
own the whole world.
-- Li Mu Bai in Tiger & Dragon
On Mon, 10 February 2003 09:58:46 -0500, Dan Parks wrote:
>
> Do you happen to have a program (or kernel module...) that times these
> latencies? I've been trying to generate statistics about these kinds of
> latencies, and have yet to be happy with any of my tests.
That should be impossible to do. :)
Write a simple handler for parport or so, that is called when line #1
toggles from low to high and responds by pulling line #2 from low to
high.
Now hook up a signal generator and an oscilloscope and measure the
time from signal generation to the physical reaction.
This way you get all the latency, not just the small amount you can
measure inside the kernel.
Jörn
--
Everything should be made as simple as possible, but not simpler.
-- Albert Einstein
On Mon, 10 Feb 2003, Jörn Engel wrote:
> On Mon, 10 February 2003 09:58:46 -0500, Dan Parks wrote:
> >
> > Do you happen to have a program (or kernel module...) that times these
> > latencies? I've been trying to generate statistics about these kinds of
> > latencies, and have yet to be happy with any of my tests.
>
> That should be impossible to do. :)
>
> Write a simple handler for parport or so, that is called when line #1
> toggles from low to high and responds by pulling line #2 from low to
> high.
> Now hook up a signal generator and an oscilloscope and measure the
> time from signal generation to the physical reaction.
>
> This way you get all the latency, not just the small amount you can
> measure inside the kernel.
>
> Jörn
>
Yes, and you will find that you can replicate a square-wave, through
the hardware and software up to about 50 kHz with a 400 MHz Pentium
if you disconnect your network during the tests.
My tests, several years ago, in the ISR simply XOR-ed a saved
copy of bit zero with `1` to toggle it and wrote it out the
data port. This would occur at every IRQ7, generated by hitting
bit 2 of the control port with a function generator. This should
produce a symmetrical /2 when you look at bit 0. You can line up
the starting 'high' of the function generator, with either a high or
low of bit 0 (because it's /2) and measure the time, which from
my notebook looks like 1.2 to 1.4 microseconds on a 400MHz machine.
You can increase the interrupt rate until the machine is no longer
able to keep up. This usually occurs around 110 kHz or higher.
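The divide-by-two behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: one interrupt per input period, modelled here as the rising edge, and zero latency. The name `simulate_toggle_isr` is hypothetical and none of this is the original test code.

```python
def simulate_toggle_isr(input_levels):
    """Simulate an ISR that toggles output bit 0 on each rising input edge."""
    saved_bit = 0   # saved copy of data-port bit 0
    previous = 0    # previous input level, used for edge detection
    output = []
    for level in input_levels:
        if previous == 0 and level == 1:  # rising edge: the IRQ fires (assumed)
            saved_bit ^= 1                # XOR with 1 toggles the bit
        output.append(saved_bit)          # value currently on the data port
        previous = level
    return output

square_wave = [1, 0] * 4  # four periods from the function generator
print(simulate_toggle_isr(square_wave))
# prints [1, 1, 0, 0, 1, 1, 0, 0]
```

The output completes one period for every two input periods, which is the symmetrical /2 signal. On real hardware each output edge lags its input edge by the interrupt latency, and that lag is what the oscilloscope measures.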
If you modify whatever Ethernet driver you are using to remove the
loop in its ISR, you can get good results with the network connected.
However, you will have poor results with any network driver that
contains a loop in its ISR.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.
On Mon, 10 February 2003 11:20:54 -0500, Richard B. Johnson wrote:
> On Mon, 10 Feb 2003, Jörn Engel wrote:
>
> > Write a simple handler for parport or so, that is called when line #1
> > toggles from low to high and responds by pulling line #2 from low to
> > high.
> > Now hook up a signal generator and an oscilloscope and measure the
> > time from signal generation to the physical reaction.
>
> Yes, and you will find that you can replicate a square-wave, through
> the hardware and software up to about 50 kHz with a 400 MHz Pentium
> if you disconnect your network during the tests.
>
> My tests, several years ago, in the ISR simply XOR-ed a saved
> copy of bit zero with `1` to toggle it and wrote it out the
> data port. This would occur at every IRQ7, generated by hitting
> bit 2 of the control port with a function generator. This should
> produce a symmetrical /2 when you look at bit 0. You can line up
> the starting 'high' of the function generator, with either a high or
> low of bit one (because it's /2) and measure the time, which from
> my notebook looks like 1.2 to 1.4 microseconds on a 400MHz machine.
>
> You can increase the interrupt rate until the machine is no longer
> able to keep up. This usually occurs around 110 kHz or higher.
1.2us translates to roughly 800kHz, or about 500 clock cycles at 400MHz.
That is a good response time.
A 110kHz max rate translates to roughly 3500 clock cycles for the complete
interrupt path. This means that the return path takes about six times
longer to complete than the initialisation path. Odd.
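For reference, the arithmetic behind these figures can be reproduced as below. It is a back-of-the-envelope sketch using only the numbers quoted above (400 MHz, 1.2 us, 110 kHz); the variable names are mine, and the "return path" is simply taken to be whatever remains of the full period after the response time.

```python
CLOCK_HZ = 400e6      # 400 MHz CPU from the quoted measurement
response_s = 1.2e-6   # measured signal-to-reaction time
max_rate_hz = 110e3   # rate at which the machine stops keeping up

response_cycles = response_s * CLOCK_HZ       # cycles until the output reacts
full_path_cycles = CLOCK_HZ / max_rate_hz     # cycles per interrupt at max rate
return_cycles = full_path_cycles - response_cycles
ratio = return_cycles / response_cycles       # return path vs. entry path

print(f"response:  {response_cycles:.0f} cycles")
print(f"full path: {full_path_cycles:.0f} cycles")
print(f"ratio:     {ratio:.1f}")
```

The exact values come out near 480 and 3636 cycles, with a ratio of about 6.6, so the rounded figures of 500, 3500 and "six times" hold.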
Somehow, I get this feeling that Linux could still do better. 500
cycles is already better than the 2000 we observed, but most of that
should come from the different architecture.
An assembler interrupt handler that saves registers, tweaks a couple
of bits, restores registers and gets the hell out of here should be on
the order of 100 cycles, maybe less. Why is Linux wasting all this
time?
Jörn
--
Simplicity is prerequisite for reliability.
-- Edsger W. Dijkstra
On Mon, 10 Feb 2003, Jörn Engel wrote:
> On Mon, 10 February 2003 11:20:54 -0500, Richard B. Johnson wrote:
> > On Mon, 10 Feb 2003, Jörn Engel wrote:
> >
> > > Write a simple handler for parport or so, that is called when line #1
> > > toggles from low to high and responds by pulling line #2 from low to
> > > high.
> > > Now hook up a signal generator and an oscilloscope and measure the
> > > time from signal generation to the physical reaction.
> >
> > Yes, and you will find that you can replicate a square-wave, through
> > the hardware and software up to about 50 kHz with a 400 MHz Pentium
> > if you disconnect your network during the tests.
> >
> > My tests, several years ago, in the ISR simply XOR-ed a saved
> > copy of bit zero with `1` to toggle it and wrote it out the
> > data port. This would occur at every IRQ7, generated by hitting
> > bit 2 of the control port with a function generator. This should
> > produce a symmetrical /2 when you look at bit 0. You can line up
> > the starting 'high' of the function generator, with either a high or
> > low of bit one (because it's /2) and measure the time, which from
> > my notebook looks like 1.2 to 1.4 microseconds on a 400MHz machine.
> >
> > You can increase the interrupt rate until the machine is no longer
> > able to keep up. This usually occurs around 110 kHz or higher.
>
> 1.2us translates to 800kHz or 500 clock cycles. That is a good
> response time.
> 110kHz max rate translates to 3500 clock cycles for the complete
> interrupt path. This means that the return path takes six times longer
> to complete than the initialisation path. Odd.
>
> Somehow, I get this feeling that linux could still do better. 500
> cycles is already better than the 2000 we observed, but most of that
> should come from the different architecture.
>
> An assembler interrupt handler that saves registers, tweaks a couple
> of bits, restores registers and gets the hell out of here should be in
> the order of 100 cycles, maybe less. Why is linux wasting all this
> time?
>
> Jörn
>
With any complete system, you need to do more than save a few
registers. If you are to be able to use 'C' for the ISR, you need
a stack, the access registers (on ix86, the segment registers) need
to be initialized, and the appropriate interrupt context, i.e.,
kernel mode, needs to be established. Further, you are going to talk
to the legacy interrupt controller(s) through I/O ports that take
about 300 ns per access. Also, to prevent unexpected reentry, some
kernel code has to be executed and then the module (driver) ISR is
called from kernel code. It is not something as simple, or as broken,
as the "void interrupt far ISR()" that DOS thunkers are used to
programming.
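To put the 300 ns figure in perspective against a 100-cycle budget, here is a rough sketch. The 400 MHz clock matches the test machine mentioned in this thread. The count of three port accesses per interrupt is purely a hypothetical number for illustration, not a measured one, and `io_cycles` is a made-up helper name.

```python
def io_cycles(accesses, access_ns=300, clock_mhz=400):
    """CPU cycles spent waiting on legacy I/O port accesses."""
    return accesses * access_ns * clock_mhz / 1000

per_access = io_cycles(1)   # cost of a single port access
# Hypothetical: three controller accesses per interrupt (an assumption).
per_interrupt = io_cycles(3)

print(f"one access:     {per_access:.0f} cycles")
print(f"three accesses: {per_interrupt:.0f} cycles")
```

Even a single access to the interrupt controller costs 120 cycles at this clock, so a 100-cycle handler is out of reach before any register saving or kernel bookkeeping is counted.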
As previously stated, if you use IOAPIC and have a fast CPU, your
speed increases. All my work on characterizing the interrupt system
on Linux was based upon legacy I/O and a 400 MHz CPU. From that
work, I was able to find the major problems with so-called latency
and I was able to get a SC520 embedded system (33 MHz clock, 133 MHz
CPU core) to work with an average interrupt rate of over 15 kHz and
a peak exceeding 30 kHz. I did not have a disk-drive to worry about,
but I had Network I/O so I needed to modify the PCnet 32 driver
(AMD chip) so it didn't loop in the ISR. That fixed the only latency
problem I had.
FYI, interrupt latency doesn't scale well with execution speed. You
are almost always I/O bound somewhere.
Cheers,
Dick Johnson
Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.
On Mon, 10 February 2003 13:30:49 -0500, Richard B. Johnson wrote:
> On Mon, 10 Feb 2003, Jörn Engel wrote:
>
> > An assembler interrupt handler that saves registers, tweaks a couple
> > of bits, restores registers and gets the hell out of here should be in
> > the order of 100 cycles, maybe less. Why is linux wasting all this
> > time?
>
> FYI execution speed and interrupt latency doesn't scale well. You
> are most always I/O bound somewhere.
Ack. It always takes a moment to get used to these facts.
Jörn
--
Beware of bugs in the above code; I have only proved it correct, but
not tried it.
-- Donald Knuth