2010-06-30 06:08:19

by Hari LKML

Subject: Interrupt Affinity in SMP

Hi All,

We have an HP DL380 server with 48GB of DDR3/1333MHz RAM, four 1Gbps NIC
ports, one external RAID card, one internal RAID card, and two Intel(R)
Xeon(R) X5570 CPUs @ 2.93GHz, used as an NFS storage server. It is running
Ubuntu 9.10 with kernel 2.6.31-14-server.
(The storage serves NFS to more than 300 VMs running under Citrix XenServer.)

For more info on the hardware configuration:
http://h18000.www1.hp.com/products/quickspecs/12028_div/12028_div.html

With HTT enabled in the BIOS, I can see 16 CPUs (0-15) in cat /proc/cpuinfo.

The problem is that all the interrupts, including the timer interrupts, are
handled by CPU0 according to /proc/interrupts.
The value of /proc/irq/default_smp_affinity is ffff, and the smp_affinity of
every interrupt is also ffff.

My understanding is that with these settings all the interrupts should be
handled by the first 8 CPUs in a round-robin fashion, or in our case at
worst by the first 4 cores (since HTT is enabled; I am not clear on this
concept, so could someone please shed some light on it for better
understanding? :-) ).

I also tried to set the smp_affinity of one interrupt to 0004, to test
whether that interrupt would then be handled by some other core. But I was
unable to change this value on the running system; I got an "access denied"
error, so I could not test this.
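
For reference, what I tried was essentially the following (IRQ 50 here is
just a placeholder for the interrupt I tested; 0004 has only bit 2 set,
i.e. CPU2):

  # cat /proc/irq/50/smp_affinity
  ffff
  # echo 0004 > /proc/irq/50/smp_affinity

One pitfall I may have hit: if the echo is run through sudo, the
redirection is still performed by the unprivileged shell, which also
gives a permission error. Piping through tee avoids that:

  $ echo 0004 | sudo tee /proc/irq/50/smp_affinity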

Please help me distribute the interrupts among the other cores of my
system, which might improve its performance.

Let me know if you need any logs or other information about the system.

Hari


2010-07-09 22:59:18

by Bryan Hundven

Subject: Re: Interrupt Affinity in SMP

Mauro, list,

(Please CC me in replies; I am not on the LKML list.)

I am having a similar issue to the one reported here:
http://lkml.indiana.edu/hypermail/linux/kernel/1006.3/01811.html
which has no responses.

My company is evaluating a dual Xeon 5645 board with the 5520 chipset.
Attached is my kernel config.

We have 12 Intel 82580 (igb) NICs, and their affinity is set to the
default mask (ffffff, i.e. all 24 CPUs), but I see all interrupts
happening on CPU0:
=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====
root@(none):~# cat /proc/irq/85/smp_affinity
ffffff
root@(none):~# cat /proc/irq/86/smp_affinity
ffffff
root@(none):~# cat /proc/irq/87/smp_affinity
ffffff
=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====

=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====
root@(none):~# cat /proc/interrupts
            CPU0    (counts on CPU1-CPU23 were zero for every numbered
                     IRQ below and are omitted; runs of identical
                     entries are grouped)
   0:         70    IO-APIC-edge      timer
   1:          3    IO-APIC-edge      i8042
   4:       4282    IO-APIC-edge      serial
   8:          1    IO-APIC-edge      rtc0
   9:          0    IO-APIC-fasteoi   acpi
  12:          3    IO-APIC-edge      i8042
  14:          0    IO-APIC-edge      ata_piix
  15:          0    IO-APIC-edge      ata_piix
  16:        438    IO-APIC-fasteoi   pata_it8213
  18:          0    IO-APIC-fasteoi   uhci_hcd:usb4
  19:          0    IO-APIC-fasteoi   ata_piix, uhci_hcd:usb3
  23:         26    IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
  64-73:   0 each   PCI-MSI-edge      aerdrv
  76-83:   3 each   PCI-MSI-edge      ioat-msix
  84:          0    PCI-MSI-edge      pkp_dev
  85:          1    PCI-MSI-edge      eth0
  86-93:  47 each   PCI-MSI-edge      eth0-TxRx-0 .. eth0-TxRx-7
  94:          1    PCI-MSI-edge      eth1
  95-102: 47 each   PCI-MSI-edge      eth1-TxRx-0 .. eth1-TxRx-7
 103:          1    PCI-MSI-edge      eth2
 104-111: 46 each   PCI-MSI-edge      eth2-TxRx-0 .. eth2-TxRx-7
 112:          1    PCI-MSI-edge      eth3
 113-120: 46 each   PCI-MSI-edge      eth3-TxRx-0 .. eth3-TxRx-7
 121:          1    PCI-MSI-edge      eth4
 122-129: 46 each   PCI-MSI-edge      eth4-TxRx-0 .. eth4-TxRx-7
 130:          1    PCI-MSI-edge      eth5
 131-138: 46 each   PCI-MSI-edge      eth5-TxRx-0 .. eth5-TxRx-7
 139:          1    PCI-MSI-edge      eth6
 140-147: 46 each   PCI-MSI-edge      eth6-TxRx-0 .. eth6-TxRx-7
 148:          1    PCI-MSI-edge      eth7
 149-156: 46 each   PCI-MSI-edge      eth7-TxRx-0 .. eth7-TxRx-7
 NMI:   0 on all CPUs                 Non-maskable interrupts
 LOC:   45670 45920 45859 45793 45791 45755 45720 45685
        45796 45615 45580 45545 45503 45474 45436 45405
        45369 45334 45299 45264 45229 45194 45159 45106
                                      Local timer interrupts
 SPU:   0 on all CPUs                 Spurious interrupts
 PMI:   0 on all CPUs                 Performance monitoring interrupts
 PND:   0 on all CPUs                 Performance pending work
 RES:   109 2 1 3 0 1 0 1 0 1 0 1
        8 1 47 1 0 1 0 1 0 1 0 8      Rescheduling interrupts
 CAL:   5 316 132 130 128 126 124 122 120 118 116 114
        112 110 108 106 104 102 100 98 96 94 92 86
                                      Function call interrupts
 TLB:   13 6 4 0 1 0 0 0 0 0 0 0
        24 0 13 28 1 0 0 0 0 0 0 0    TLB shootdowns
 TRM:   0 on all CPUs                 Thermal event interrupts
 THR:   0 on all CPUs                 Threshold APIC interrupts
 MCE:   0 on all CPUs                 Machine check exceptions
 MCP:   2 on all CPUs                 Machine check polls
 ERR:   0
 MIS:   0
=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====

I am happy to test patches and run any unit tests.

Thanks in advance,

-Bryan


Attachments:
kernel.config (54.66 kB)

2010-07-10 00:48:42

by Robert Hancock

Subject: Re: Interrupt Affinity in SMP

On 07/09/2010 04:59 PM, Bryan Hundven wrote:
> Mauro, list,
>
> (Please CC me in replies; I am not on the LKML list.)
>
> I am having a similar issue to the one reported here:
> http://lkml.indiana.edu/hypermail/linux/kernel/1006.3/01811.html
> which has no responses.
>
> My company is evaluating a dual Xeon 5645 board with the 5520 chipset.
> Attached is my kernel config.
>
> We have 12 Intel 82580 (igb) NICs, and their affinity is set to the
> default mask (ffffff, i.e. all 24 CPUs), but I see all interrupts
> happening on CPU0:
> =====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====
> root@(none):~# cat /proc/irq/85/smp_affinity
> ffffff
> root@(none):~# cat /proc/irq/86/smp_affinity
> ffffff
> root@(none):~# cat /proc/irq/87/smp_affinity
> ffffff

Have you tried changing these files to exclude CPU0?

Have you tried running the irqbalance daemon? That's what you likely
want to be doing anyway.
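
A rough sketch of both, assuming IRQ 85 from your output and 24 CPUs
(fffffe is every CPU except CPU0):

  # echo fffffe > /proc/irq/85/smp_affinity
  # cat /proc/irq/85/smp_affinity
  fffffe

or just start the daemon and let it place things:

  # irqbalance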

> =====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====8<=====
>
> [... quoted /proc/interrupts output snipped; see the message above ...]

2010-07-11 01:21:01

by Robert Hancock

Subject: Re: Interrupt Affinity in SMP

On Sat, Jul 10, 2010 at 1:46 PM, Bryan Hundven <[email protected]> wrote:
> I was able to set eth0 and its TxRx queues to cpu1, but it is my
> understanding that 0xFFFFFFFF should distribute the interrupts across
> all cpus, much like the LOC row in my /proc/interrupts output.
>
> I don't have access to the computer this weekend, but I will provide more
> info on Monday.

That may be chipset-dependent; I don't think all chipsets have the
ability to distribute interrupts like that. Round-robin distribution
of a given interrupt isn't optimal for performance anyway, since it
causes the cache lines used by the interrupt handler to ping-pong
between the different CPUs.

>
> -bryan

2010-07-17 20:02:18

by Bryan Hundven

Subject: Re: Interrupt Affinity in SMP

On Sat, Jul 10, 2010 at 6:20 PM, Robert Hancock <[email protected]> wrote:
> On Sat, Jul 10, 2010 at 1:46 PM, Bryan Hundven <[email protected]> wrote:
>> I was able to set eth0 and its TxRx queues to cpu1, but it is my
>> understanding that 0xFFFFFFFF should distribute the interrupts across
>> all cpus, much like the LOC row in my /proc/interrupts output.
>>
>> I don't have access to the computer this weekend, but I will provide more
>> info on Monday.
>
> That may be chipset-dependent; I don't think all chipsets have the
> ability to distribute interrupts like that. Round-robin distribution
> of a given interrupt isn't optimal for performance anyway, since it
> causes the cache lines used by the interrupt handler to ping-pong
> between the different CPUs.
>
> [snip]

Please see the two attached examples.

Notice in the 5410 example how we start with the affinity set to 0xff
and change it to 0xf0. This should spread the interrupts over the last
four cores of this dual-processor, quad-core system.

Also notice in the 5645 example that, with the same commands, we start
with 0xffffff and change it to 0xfff000 to spread the interrupts over
the last 12 cores, but only the first of those twelve cores receives
interrupts.
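
(The masks are just shifted runs of one-bits; a quick sketch of how
they are derived:

  $ printf '%x\n' $(( ((1 << 4) - 1) << 4 ))     # last 4 of 8 CPUs
  f0
  $ printf '%x\n' $(( ((1 << 12) - 1) << 12 ))   # last 12 of 24 CPUs
  fff000
)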

This is the inconsistency I was trying to explain before.

--Bryan


Attachments:
xeon5410-2.6.32.txt (1.84 kB)
xeon5645-2.6.35-rc4.txt (2.91 kB)

2010-07-18 18:38:25

by Ciju Rajan K

Subject: Re: Interrupt Affinity in SMP

Bryan Hundven wrote:
> On Sat, Jul 10, 2010 at 6:20 PM, Robert Hancock <[email protected]> wrote:
> [snip]
>
> Please see the two attached examples.
>
> Notice in the 5410 example how we start with the affinity set to 0xff
> and change it to 0xf0. This should spread the interrupts over the last
> four cores of this dual-processor, quad-core system.
>
> Also notice in the 5645 example that, with the same commands, we start
> with 0xffffff and change it to 0xfff000 to spread the interrupts over
> the last 12 cores, but only the first of those twelve cores receives
> interrupts.
>
> This is the inconsistency I was trying to explain before.
>
What was the status of the irqbalance daemon? Was it turned on? If it
is running, there is a chance that the interrupt count is within its
threshold limit, so interrupts are not being routed to the other cores.
Could you also try increasing the interrupt load and see whether the
distribution across the cores happens then?
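
(A quick way to check, for example:

  $ pgrep -l irqbalance    # prints a PID and name only if the daemon is up
)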

-Ciju
> --Bryan
>

2010-07-18 18:52:50

by Bryan Hundven

Subject: Re: Interrupt Affinity in SMP

On Sun, Jul 18, 2010 at 11:38 AM, Ciju Rajan K <[email protected]> wrote:
> [snip]
>
> What was the status of the irqbalance daemon? Was it turned on? If it
> is running, there is a chance that the interrupt count is within its
> threshold limit, so interrupts are not being routed to the other cores.

The irqbalance daemon was not running on either setup.

> Could you also try increasing the interrupt load and see whether the
> distribution across the cores happens then?

We used Spirent TestCenter L2/L3 test equipment and pushed 100%
throughput; the distribution stayed the same. Nothing changed.

This doesn't affect just the Ethernet drivers. I have also seen the
same issue with hardware encryption devices and other
interrupt-generating hardware.

--Bryan




--
Bryan Hundven
[email protected]

2010-07-18 19:22:49

by Ciju Rajan K

Subject: Re: Interrupt Affinity in SMP

Bryan Hundven wrote:
> On Sun, Jul 18, 2010 at 11:38 AM, Ciju Rajan K <[email protected]> wrote:
>> [snip]
>> What was the status of the irqbalance daemon? Was it turned on? If it
>> is running, there is a chance that the interrupt count is within its
>> threshold limit, so interrupts are not being routed to the other cores.
>>
>
> The irqbalance daemon was not running on either setup.
>
>
>> Could you also try increasing the interrupt load and see whether the
>> distribution across the cores happens then?
>>
>
> We used Spirent TestCenter L2/L3 test equipment and pushed 100%
> throughput; the distribution stayed the same. Nothing changed.
>
In the example that you gave, I could see just 7 interrupts after
15 seconds, so I thought I would check on that. Let me try to reproduce
this problem locally.

-Ciju

2010-07-19 17:01:52

by Bryan Hundven

Subject: Re: Interrupt Affinity in SMP

Again, I can set a TxRx interrupt to a specific core and this works
fine, but when I try to set that same TxRx interrupt to a set of
cores/processors, interrupts only occur on the first core/processor
of the set.
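
A sketch of what I am doing, using eth0-TxRx-0 (IRQ 86 on this box) as
the example:

  # echo fff000 > /proc/irq/86/smp_affinity     # CPUs 12-23
  # watch -n1 "grep -e CPU -e eth0-TxRx-0 /proc/interrupts"

Under load, only the CPU12 column ever increments; CPU13-CPU23 stay flat.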

On Sun, Jul 18, 2010 at 12:22 PM, Ciju Rajan K <[email protected]> wrote:
> Bryan Hundven wrote:
>> [snip]
>
> In the example that you gave, I could see just 7 interrupts after
> 15 seconds, so I thought I would check on that. Let me try to reproduce
> this problem locally.
>
> -Ciju



--
Bryan Hundven
[email protected]

2010-07-19 19:22:17

by Robert Hancock

Subject: Re: Interrupt Affinity in SMP

On Sat, Jul 17, 2010 at 2:02 PM, Bryan Hundven <[email protected]> wrote:
> Please see the two attached examples.
>
> Notice in the 5410 example how we start with the affinity set to 0xff
> and change it to 0xf0. This should spread the interrupts over the last
> four cores of this dual-processor, quad-core system.
>
> Also notice in the 5645 example that, with the same commands, we start
> with 0xffffff and change it to 0xfff000 to spread the interrupts over
> the last 12 cores, but only the first of those twelve cores receives
> interrupts.
>
> This is the inconsistency I was trying to explain before.

As I mentioned before, I believe the behavior in this case is
chipset-dependent, and potentially not something the kernel has control
over. In most cases, distributing the same interrupt across multiple
cores is likely a bad idea anyway, unless the interrupt load is
actually so high that a single CPU can't handle it.

2010-07-19 20:03:36

by Bryan Hundven

Subject: Re: Interrupt Affinity in SMP

On Mon, Jul 19, 2010 at 12:22 PM, Robert Hancock <[email protected]> wrote:
> On Sat, Jul 17, 2010 at 2:02 PM, Bryan Hundven <[email protected]> wrote:
>> [snip]
>
> As I mentioned before, I believe the behavior in this case is
> chipset-dependent, and potentially not something the kernel has control
> over. In most cases, distributing the same interrupt across multiple
> cores is likely a bad idea anyway, unless the interrupt load is
> actually so high that a single CPU can't handle it.

Can anyone confirm that this is how the 5520 and newer Xeon chipsets
handle affinity?

I might be totally wrong, but I thought that RSS (Receive-Side Scaling,
which is available on the 82576 network card in that 5645 Xeon example)
helps in that scenario?

--Bryan

2010-07-19 21:33:18

by Robert Hancock

Subject: Re: Interrupt Affinity in SMP

On Mon, Jul 19, 2010 at 2:03 PM, Bryan Hundven <[email protected]> wrote:
> On Mon, Jul 19, 2010 at 12:22 PM, Robert Hancock <[email protected]> wrote:
>> [snip]
>>
>> As I mentioned before, I believe the behavior in this case is
>> chipset-dependent, and potentially not something the kernel has control
>> over. In most cases, distributing the same interrupt across multiple
>> cores is likely a bad idea anyway, unless the interrupt load is
>> actually so high that a single CPU can't handle it.
>
> Can anyone confirm that this is how the 5520 and newer Xeon chipsets
> handle affinity?
>
> I might be totally wrong, but I thought that RSS (Receive-Side Scaling,
> which is available on the 82576 network card in that 5645 Xeon example)
> helps in that scenario?

It looks like that card gives you multiple interrupt vectors that can
be serviced independently by multiple CPUs. However, each individual
vector is likely to be handled by only one CPU at a time.
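
In that case, rather than giving every vector a wide mask, the usual
approach is to spread the vectors themselves, one CPU each. A rough
sketch, assuming eth0's queues are still IRQs 86-93 as in the earlier
output:

  # for i in 0 1 2 3 4 5 6 7; do
      printf '%x\n' $((1 << i)) > /proc/irq/$((86 + i))/smp_affinity
    done

which pins eth0-TxRx-0 through eth0-TxRx-7 to CPU0 through CPU7
respectively.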