2004-03-22 23:04:18

by Rusty Russell

Subject: arch/i386/Kconfig: CONFIG_IRQBALANCE Description

I came across this description:

config IRQBALANCE
bool "Enable kernel irq balancing"
depends on SMP && X86_IO_APIC
default y
help
The default yes will allow the kernel to do irq load balancing.
Saying no will keep the kernel from doing irq load balancing.

Holy shades of preempt! Help messages should describe the advantages
and disadvantages of an option: a little more disclosure would really help
here.

IMHO, if you're having trouble describing the pros and cons of an option
to a user of average technical ability (think sysadmin), it's not a good
config option.

Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell


2004-03-23 17:16:52

by John Stoffel

Subject: Re: arch/i386/Kconfig: CONFIG_IRQBALANCE Description


And hey, under 2.6.5-rc2-mm1, it doesn't seem to do anything:

> cat /proc/version
Linux version 2.6.5-rc2-mm1 (john@jfsnew) (gcc version 3.3.3 (Debian 20040314)) #3 SMP Sun Mar 21 15:26:27 EST 2004

> zcat /proc/config.gz | grep IRQ
CONFIG_IRQBALANCE=y
CONFIG_IDEPCI_SHARE_IRQ=y

> cat /proc/interrupts
CPU0 CPU1
0: 46272316 487 IO-APIC-edge timer
1: 376 0 IO-APIC-edge i8042
2: 0 0 XT-PIC cascade
4: 4147 1 IO-APIC-edge serial
7: 2 0 IO-APIC-edge parport0
8: 4 0 IO-APIC-edge rtc
11: 95936 1 IO-APIC-edge Cyclom-Y
12: 677 0 IO-APIC-edge i8042
14: 87 0 IO-APIC-edge ide0
16: 46770 3 IO-APIC-level ide2, ide3, ehci_hcd
17: 307832 1 IO-APIC-level eth0
18: 118258 1 IO-APIC-level aic7xxx, aic7xxx, ohci_hcd
19: 0 0 IO-APIC-level ohci_hcd
NMI: 0 0
LOC: 46279245 46279281
ERR: 0
MIS: 417
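
One way to put a number on the imbalance in the table above is to compute each CPU's share of every interrupt line. The following is a minimal Python sketch, not part of the original report: the helper names `parse_interrupts` and `cpu_share` are made up for illustration, and the sample text is a shortened copy of the output above. It assumes whitespace-separated columns and skips rows such as ERR and MIS that carry a single total instead of per-CPU counts.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: (per_cpu_counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        if len(counts) != ncpus:
            continue  # rows like ERR/MIS have one total, not per-CPU counts
        table[label.strip()] = (counts, " ".join(fields[ncpus:]))
    return table


def cpu_share(counts):
    """Return each CPU's fraction of the interrupts on one line."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


# Shortened copy of the output quoted above (illustrative input only).
SAMPLE = """\
CPU0 CPU1
0: 46272316 487 IO-APIC-edge timer
17: 307832 1 IO-APIC-level eth0
LOC: 46279245 46279281
ERR: 0
"""

if __name__ == "__main__":
    for label, (counts, desc) in parse_interrupts(SAMPLE).items():
        shares = ", ".join(f"{s:.1%}" for s in cpu_share(counts))
        print(f"{label:>4} {desc:<24} {shares}")
```

On a live system the same function could be fed the contents of /proc/interrupts instead of the sample string. Only the local timer (LOC) row is spread evenly in the figures above; the device interrupts land almost entirely on CPU0.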


John

2004-03-23 21:30:18

by Miquel van Smoorenburg

Subject: Re: arch/i386/Kconfig: CONFIG_IRQBALANCE Description

In article <[email protected]>,
John Stoffel <[email protected]> wrote:
>
>And hey, under 2.6.5-rc2-mm1, it doesn't seem to do anything:
>
> > zcat /proc/config.gz | grep IRQ
> CONFIG_IRQBALANCE=y
> CONFIG_IDEPCI_SHARE_IRQ=y
>
> > cat /proc/interrupts
> CPU0 CPU1
> 0: 46272316 487 IO-APIC-edge timer
> 1: 376 0 IO-APIC-edge i8042
> 16: 46770 3 IO-APIC-level ide2, ide3, ehci_hcd
> 17: 307832 1 IO-APIC-level eth0
> 18: 118258 1 IO-APIC-level aic7xxx, aic7xxx, ohci_hcd
> LOC: 46279245 46279281

Is that real SMP, or hyperthreading? If it's hyperthreading, then
it makes sense that the IRQs are not balanced.

In fact I have a server on which the IRQ balancing code does
balance over the 2 virtual CPUs by accident (still have to debug
what goes wrong and file a proper bug report) and as a result
performance sucked until I turned it off.

Mike.
--
Netu, v qba'g yvxr gur cynvagrkg :)

2004-03-24 14:04:58

by John Stoffel

Subject: Re: arch/i386/Kconfig: CONFIG_IRQBALANCE Description

>>>>> "Miquel" == Miquel van Smoorenburg <[email protected]> writes:

Miquel> In article <[email protected]>,
Miquel> John Stoffel <[email protected]> wrote:
>>
>> And hey, under 2.6.5-rc2-mm1, it doesn't seem to do anything:
>>
>> > zcat /proc/config.gz | grep IRQ
>> CONFIG_IRQBALANCE=y
>> CONFIG_IDEPCI_SHARE_IRQ=y
>>
>> > cat /proc/interrupts
>> CPU0 CPU1
>> 0: 46272316 487 IO-APIC-edge timer
>> 1: 376 0 IO-APIC-edge i8042
>> 16: 46770 3 IO-APIC-level ide2, ide3, ehci_hcd
>> 17: 307832 1 IO-APIC-level eth0
>> 18: 118258 1 IO-APIC-level aic7xxx, aic7xxx, ohci_hcd
>> LOC: 46279245 46279281

Miquel> Is that real SMP, or hyperthreading? If it's hyperthreading,
Miquel> then it makes sense that the IRQs are not balanced.

It's a dual Xeon PIII 550 MHz, in a Dell Precision 610MT workstation.
Intel GX chipset. None of that fancy HT stuff here!

> cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 547.343
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1077.24

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 7
model name : Pentium III (Katmai)
stepping : 3
cpu MHz : 547.343
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1093.63
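
The "real SMP or hyperthreading" question can be checked mechanically from output like the above. The following is a rough Python sketch under stated assumptions: the helper names `parse_cpuinfo` and `reports_ht` are invented for illustration, and the sample is a trimmed copy of the listing above. Note the limits of the check: the absence of the `ht` flag, as here, rules hyperthreading out, but its presence would only mean the CPU reports the capability, not that sibling threads are actually enabled.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical processor."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends one processor's stanza.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def reports_ht(cpu):
    """True if this processor's flags line includes the 'ht' capability flag."""
    return "ht" in cpu.get("flags", "").split()


# Trimmed copy of the listing above (illustrative input only).
SAMPLE = """\
processor : 0
model name : Pentium III (Katmai)
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse

processor : 1
model name : Pentium III (Katmai)
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
"""

if __name__ == "__main__":
    cpus = parse_cpuinfo(SAMPLE)
    print(len(cpus), "logical processors;",
          "ht flag present" if any(reports_ht(c) for c in cpus) else "no ht flag")
```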

2004-03-25 19:07:25

by Jamie Lokier

Subject: Re: arch/i386/Kconfig: CONFIG_IRQBALANCE Description

Miquel van Smoorenburg wrote:
> Is that real SMP, or hyperthreading? If it's hyperthreading, then
> it makes sense that the IRQs are not balanced.

That's unfair to the two tasks which might be running on each virtual
CPU: one of the tasks is interrupted often.

> In fact I have a server on which the IRQ balancing code does
> balance over the 2 virtual CPUs by accident (still have to debug
> what goes wrong and file a proper bug report) and as a result
> performance sucked until I turned it off.

What caused the suckage? Obviously there's a small amount of time spent doing
the work of rebalancing, but there is no cache hit from moving an
interrupt between virtual CPUs, unlike with SMP, so why did that make
performance suck?

-- Jamie

2004-03-26 10:39:50

by Miquel van Smoorenburg

Subject: Re: arch/i386/Kconfig: CONFIG_IRQBALANCE Description

On Thu, 25 Mar 2004 20:07:18, Jamie Lokier wrote:
> Miquel van Smoorenburg wrote:
> > Is that real SMP, or hyperthreading? If it's hyperthreading, then
> > it makes sense that the IRQs are not balanced.
>
> That's unfair to the two tasks which might be running on each virtual
> CPU: one of the tasks is interrupted often.
>
> > In fact I have a server on which the IRQ balancing code does
> > balance over the 2 virtual CPUs by accident (still have to debug
> > what goes wrong and file a proper bug report) and as a result
> > performance sucked until I turned it off.
>
> What caused the suckage? Obviously there's a small amount of time spent doing
> the work of rebalancing, but there is no cache hit from moving an
> interrupt between virtual CPUs, unlike with SMP, so why did that make
> performance suck?

I have no idea. I have a transit NNTP server, newsgate.cistron.nl, that
has an acenic GigE card, 1 Maxtor ATA 80 GB for the O/S, and 4 Maxtor
SATA 80 GB for database and storage. Sustained input is 100 mbit/sec,
sustained output is 250-300 mbit/sec.

It ran fine with 2.6.0-test11, but not any later kernels. I tried 2.6.2,
2.6.3 etc but somehow the output wouldn't get above 100 mbit/sec. Then
I noticed that with the 2.6.0-test11 kernel IRQs weren't balanced over
the 2 SMT cores while with later kernels they were.

So I installed a 2.6.4-rc2 kernel. Bad performance. Did a
"echo 1 > /proc/irq/XX/smp_affinity" for the NIC and IO interrupts,
and performance went bang right back to what it was before.

Yesterday I rebooted with 2.6.4-rc2, but with the "noirqbalance" option.
That didn't really perform well. So I rebooted again with 2.6.5-rc2
without the "noirqbalance" option. It appeared to run better, but not
quite up to par. Then I did the "echo 1 > /proc/irq/XX/smp_affinity"
for the NIC and IO interrupts again. Output traffic peaked again.
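
For readers unfamiliar with the file: the value written to /proc/irq/XX/smp_affinity is a hexadecimal bitmask of allowed CPUs, so "echo 1" pins the interrupt to CPU0 only, and 3 would allow CPU0 and CPU1. The following is a small Python sketch of that mapping. The function names `cpus_to_mask` and `mask_to_cpus` are illustrative, not kernel or library APIs, and the comma handling assumes the grouped form some kernels print for large masks.

```python
def cpus_to_mask(cpus):
    """Build the hex string for smp_affinity from a list of CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu  # bit N set means CPU N may handle the interrupt
    return format(mask, "x")


def mask_to_cpus(mask_hex):
    """Decode a hex affinity mask (optionally comma-grouped) into CPU numbers."""
    mask = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(cpus_to_mask([0]))     # the value used in the post: CPU0 only
    print(cpus_to_mask([0, 1]))  # both CPUs of a two-way box
    print(mask_to_cpus("1"))
```

Writing the resulting string to the smp_affinity file requires root and a real IRQ number in place of XX, which the post leaves unspecified.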

If you look at
http://newsgate.cistron.nl/lkml/stats.cistron.nl/mrtg/html/quantum.html
you can see the effect on the bandwidth stats:

25-03 before 14:00 2.6.4-rc2 with "echo 1 > /proc/irq/XX/smp_affinity"
25-03 16:00 2.6.4-rc2 with noirqbalance
26-03 01:00 2.6.5-rc2 irq balancing enabled
26-03 10:30 2.6.5-rc2 with "echo 1 > /proc/irq/XX/smp_affinity"

In my case, it looks like the box runs best with only IRQ balancing
for the timer interrupt over the 2 SMT cores, and no IRQ balancing
for all the other hardware.

I have no idea _why_ this affects throughput so much - the box itself
doesn't "feel" any different wrt latency on the shell, or load average.
It's just throughput, and I don't even know if this is disk controller
or NIC related.

Now since the box was down for 2 hours yesterday, it also has a
large backlog to process. I really have to retry this once it has been
running for a few days in a stable state, that's why I haven't really
dug into it yet, circumstances have been changing too much.

Mike.