2003-03-10 22:12:01

by uaca

Subject: IRQ affinity good for lowering interrupt latency?


Hi all

I have measured the interrupt latency of a bi-processor system with akpm's intlat/timepeg
utilities and kernel 2.4.20. The system is a two-way PIII@800MHz
-- Intel motherboard, ISP 2100 if I remember correctly, with a SCSI disk on an aic-7896

I wanted to find out whether I could reduce the latency of an interrupt handler by
binding this handler to a particular CPU and not allowing any other
interrupt to execute on that CPU.

I found that overall latency increases, and so does the latency on both CPUs;
that is, I could not reduce the latency of the interrupt I was interested in.

I also tried disallowing just the SCSI and Ethernet handlers from executing on the
CPU to which I'm binding the interrupt handler I'm interested in; I get
similar results.

The interrupt handler I'm interested in belongs to an ATM card receiving
around 6000 interrupts/second.

Any comments will be greatly appreciated.

Thanks in advance


I attach a sample table

Ulisses

Debian GNU/Linux: a dream come true
-----------------------------------------------------------------------------
"Computers are useless. They can only give answers." Pablo Picasso

---> Visit http://www.valux.org/ to learn about the <---
---> Asociación Valenciana de Usuarios de Linux <---

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

bound-* -> binding IRQs to specific CPUs
not_bound-* -> default IRQ balancing
idle-* -> no extra load, just receiving interrupts, and using IRQ balancing
(not tcpdumping from the device)

*-round? -> same test, just another sample

The following loads were added:

*/cpu_io_load* -> adding a process load, and another process doing intensive I/O (write) +tcpdump
*/cpu_load* -> process load +tcpdump
*/io_load* -> I/O load +tcpdump
*/no_load* -> just tcpdump (always running on the ATM card)

time unit is microseconds

<config name> <cpu0> <cpu1> <both cpu> <running time>
bound-round1/cpu_io_load.conf-capturando-atm 72721403.12 32011026.4 104732430.03 334000000
bound-round1/cpu_load.conf-capturando-atm 17330835.03 22209536.73 39540372.14 301000000
bound-round1/io_load.conf-capturando-atm 26490926.85 35473241.13 61964168.45 332000000
bound-round1/no_load.conf-capturando-atm 17723058.55 24066995.44 41790054.28 300000000
bound-round2/cpu_io_load.conf-capturando-atm 70139325.9899999 31673560.18 101812886.64 332000000
bound-round2/cpu_load.conf-capturando-atm 20834850.44 22103680.54 42938531.31 300000000
bound-round2/io_load.conf-capturando-atm 25995860.86 35030250.99 61026112.51 331000000
bound-round2/no_load.conf-capturando-atm 16926317.67 22890716.86 39817034.87 300000000
not_bound-round1/cpu_io_load.conf-capturando-atm 27857114.18 30214652.36 58071767.23 330000000
not_bound-round1/cpu_load.conf-capturando-atm 17861389.71 17826837.68 35688227.83 300000000
not_bound-round1/io_load.conf-capturando-atm 28921009.86 30368027.35 59289037.74 329000000
not_bound-round1/no_load.conf-capturando-atm 17095881.6 17804144.4 34900026.4 301000000
not_bound-round2/cpu_io_load.conf-capturando-atm 28958809.19 29576235.94 58535045.78 333000000
not_bound-round2/cpu_load.conf-capturando-atm 17654701.15 19004883.19 36659584.82 301000000
not_bound-round2/io_load.conf-capturando-atm 30383447.95 31634551.35 62017999.99 332000000
not_bound-round2/no_load.conf-capturando-atm 18103003.63 17573932.55 35676936.69 300000000
idle-round1/no_load.conf-idle 0 0 16030718.21 300000000
idle-round2/no_load.conf-idle 0 0 16106698.44 300000000
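The accumulated-latency figures above are easier to compare as a fraction of running time. A small sketch of that calculation (the numbers are copied from the no_load and idle rows of the table; the helper name is mine):

```python
# Accumulated interrupt latency (us) and running time (us), from the table above.
samples = {
    "bound-round1/no_load":     (41790054.28, 300000000),
    "bound-round2/no_load":     (39817034.87, 300000000),
    "not_bound-round1/no_load": (34900026.40, 301000000),
    "not_bound-round2/no_load": (35676936.69, 300000000),
    "idle-round1/no_load":      (16030718.21, 300000000),
}

def latency_pct(latency_us, runtime_us):
    """Accumulated latency as a percentage of total running time."""
    return 100.0 * latency_us / runtime_us

for name, (lat, run) in samples.items():
    print(f"{name}: {latency_pct(lat, run):.1f}% of running time")
```

Read this way, the bound runs sit at roughly 13-14% of running time versus about 12% unbound and about 5% idle, which is the point of the table: binding made things worse, not better.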


2003-03-11 12:08:38

by uaca

Subject: is irq smp affinity good for anything?



if IRQ affinity cannot help with interrupt latency, and
interrupt auto-balancing gets the same throughput
as IRQ binding... what is IRQ SMP affinity for???

from this mail from Intel
http://www.uwsg.iu.edu/hypermail/linux/kernel/0301.0/1886.html
I understand that Intel's interrupt auto-balancing works as well as manually
tuning the interrupts....

so?

Ulisses



On Mon, Mar 10, 2003 at 11:22:22PM +0100, [email protected] wrote:
[SNIPPED...]


2003-03-11 12:28:18

by Richard B. Johnson

Subject: Re: is irq smp affinity good for anything?

On Tue, 11 Mar 2003 [email protected] wrote:

>
>
> if IRQ affinity cannot help with interrupt latency, and
> interrupt auto-balancing gets the same throughput
> as IRQ binding... what is IRQ SMP affinity for???
>
> from this mail from Intel
> http://www.uwsg.iu.edu/hypermail/linux/kernel/0301.0/1886.html
> I understand that Intel's interrupt auto-balancing works as well as manually
> tuning the interrupts....
>
> so?
>
> Ulisses
[SNIPPED...]

33 MHz machines easily handle 6,000 interrupts per second --
unless you are trying to execute code within that interrupt
that requires 1/6000th of a second to execute! Perhaps it's
not a "latency" problem, but an interrupt code-bloat problem
where most of the stuff should be executed out of the interrupt
context.

There are timer queues and kernel threads available that might
help you. Also a tightly-coupled user-mode daemon that uses
your driver only to interface with the hardware and not do
any "logic" is probably the way to go.

Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.


2003-03-11 13:54:28

by uaca

Subject: Re: is irq smp affinity good for anything?

On Tue, Mar 11, 2003 at 07:40:15AM -0500, Richard B. Johnson wrote:

Hi Richard, thanks so much for your reply

> On Tue, 11 Mar 2003 [email protected] wrote:
[...]
>
> 33 MHz machines easily handle 6,000 interrupts per second --
> unless you are trying to execute code within that interrupt
> that requires 1/6000th of a second to execute! Perhaps it's
> not a "latency" problem, but an interrupt code-bloat problem
> where most of the stuff should be executed out of the interrupt
> context.

it seems my explanation was too fuzzy; I should have explained it better

what I wanted to try is avoiding IRQ-latency paths on the CPU that is
executing the ISR, since I'm interested in not delaying time stamps by any other
means.

And yes... maybe I'm a little paranoid about this, but doing an
echo <something> > /proc/irq/[0-9]*/smp_affinity is cheap,
and supposedly I should get better results... or not?
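For reference, that echo writes a hexadecimal CPU bitmask into the proc file. A sketch of doing the same from a script; the mask helper is mine, and the actual write assumes root and a kernel exposing /proc/irq/:

```python
def cpu_mask(cpus):
    """Hex bitmask with one bit set per CPU number, as smp_affinity expects."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def bind_irq(irq, cpus):
    """Write the mask for one IRQ; requires root on a real system."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask(cpus))

# CPU0 -> mask "1", CPU1 -> "2", both -> "3"
print(cpu_mask([0]), cpu_mask([1]), cpu_mask([0, 1]))
```

So on a two-way box, pinning an IRQ to CPU1 means writing "2" to its smp_affinity file, which is exactly what the echo above does by hand.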

anyway...

I did not expect global latency to increase to these levels...
nor the latency to increase on the CPU that's receiving
just one interrupt!

Ulisses

PS: I'm not writing a driver, just measuring


2003-03-12 10:00:37

by uaca

Subject: Re: is irq smp affinity good for anything?

On Tue, Mar 11, 2003 at 08:48:59PM -0500, Mark Hahn wrote:
> > I did not expect to increase global latency to these results...
> > and neither to increase latency in the CPU that's receiving
> > just one interrupt!
>
> but isn't that just a cache effect? that is, you're keeping
> all cpus busy (caches too) with user-space, so when the interrupt
> comes in, a bound interrupt has no choice, even if the cache
> is busy with userspace.
Hi

first of all thanks for your reply,

I think that user-space code should always make the best use of the cache it
can... in other words, I don't want to dedicate a CPU exclusively to a device
that delivers 6000 ints/second

I bound an IRQ to a CPU because I thought that:

since spin_lock_irq just disables interrupts locally, I should get better
latency; having just one ISR on that particular CPU could at least reduce
a little the number of times that interrupts get disabled on that CPU

... that was my reasoning...

but latency gets worse... that's incomprehensible to me...

Ulisses



2003-03-12 15:21:39

by uaca

Subject: Re: is irq smp affinity good for anything?


Hi Mark

thanks again for your reply

On Wed, Mar 12, 2003 at 09:00:13AM -0500, Mark Hahn wrote:
> > I think that user space code always has to make the best use of cache as it
> > can... in other words, i don't want to use a cpu exclusively for a device
> > that delivers 6000 ints/second
>
> right, 6K is trivial.
>
> > as spin_irq_locks just disables interrupts locally I should get better
> > latency that just one ISR on that particular cpu could at least reduce
> > a little the number of times that interrupts get disabled on that cpu
> >
> > ... that was my reasoning...
>
> but disabling irq's is not a really heavy operation, especially
> at only 6KHz.

yes... I assumed it could be noticeable...

> > but latency gets worse... that's not comphrensible for me...
>
> if the machine is unloaded with cache-polluting user-space tasks,
> what's the latency?


in brief: 16 seconds of accumulated latency over 300 seconds of running time

no user-space load at all, just receiving ATM traffic at 6000 interrupts/second

you can see my first post to lkml: for this case, see the idle*/* results

Ulisses

PS: just for fun: is it possible to change IRQ priorities on Linux+IO-APIC?




2003-03-13 15:36:52

by uaca

Subject: Re: is irq smp affinity good for anything?

On Thu, Mar 13, 2003 at 07:17:37AM -0500, Mark Hounschell wrote:
[...]
> If you also bind your task to the same cpu and force all other tasks from that
> cpu while doing the same with the irq, your determinism will improve greatly.
> Determinism being the difference in the best and worse case latencies. The
> smaller the better (jitter). This won't increase a single latency time but your
> determinism will be greatly improved.

Thanks so much for your comments

Yes... maybe there is also cache ping-pong because common locks are on
different CPUs... I'll try it

do you know what's the best/least intrusive patch that allows
task-to-CPU binding?

Thanks in advance again :-)

regards

Ulisses


2003-03-13 17:43:59

by Mark Hounschell

Subject: Re: is irq smp affinity good for anything?

[email protected] wrote:
>
> Thanks so much for your comments
>
> Yes... maybe there is also cache ping-pong because common locks are on
> different CPUs... I'll try it
>
> do you know what's the best/least intrusive patch that allows
> task-to-CPU binding?
Probably the least intrusive can be gotten here:

http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/

Or the O(1) scheduler patches. That is a little more intrusive, though.

Mark
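The task-binding interface from rml's cpu-affinity patch later became the standard sched_setaffinity syscall. A hedged sketch of binding the current process to one CPU, assuming a Linux system where the call is available (it returns None elsewhere):

```python
import os

def bind_to_cpu(cpu):
    """Pin the calling process to a single CPU; return the resulting
    affinity set, or None where the Linux-only syscall is unavailable
    or that CPU is not usable by this process."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # non-Linux platform
    try:
        os.sched_setaffinity(0, {cpu})  # pid 0 = the calling process
    except OSError:
        return None  # e.g. the CPU is outside this process's allowed set
    return os.sched_getaffinity(0)

print(bind_to_cpu(0))  # CPU0, which exists on any machine
```

Combined with writing the IRQ's smp_affinity mask, this is what Mark Hounschell describes above: the task and the IRQ end up on the same CPU, which helps jitter (determinism) rather than the single-shot latency.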