Under high I/O activity (storage or network), the kernel fails to
account for some CPU cycles, judging by a comparison of sar against emon
(a tool that reads the hardware PMU directly). The difference is larger on
cores that spend most of their time idle and are constantly woken up to
handle interrupts. It happens even with fine-grained IRQ time accounting
enabled (CONFIG_IRQ_TIME_ACCOUNTING).
After experimenting with the timer subsystem options (periodic ticks, idle
tickless, full tickless), the time and stats accounting options, and jiffy
values, the issue persists. The lost cycles are not accounted on other
cores as 'extra' utilization. Example with Linux 4.15.18 on bare metal,
Xeon v4 (Broadwell), driving network traffic:
          sar     emon    emon-sar   intrs/sec
core12    5.00    11.70   6.70       29,302
core17    19.07   23.16   4.09       17,345
core20    19.41   23.11   3.70       16,578
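
For reference, the sar side of the comparison can be reproduced by hand,
since sar takes its per-core figures from the /proc/stat counters. The
following is only a minimal sketch, assuming the usual field order from
proc(5) (user, nice, system, idle, iowait, irq, softirq, steal); the
function names are made up for illustration and are not part of any tool:

```python
import time
from typing import Dict, List


def parse_proc_stat(text: str) -> Dict[str, List[int]]:
    """Map each per-core line ("cpu12 ...") to its first 8 tick counters."""
    counters: Dict[str, List[int]] = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep "cpuN" lines only; skip the aggregate "cpu" line and the rest.
        if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
            # guest/guest_nice are already folded into user/nice, so stop at steal.
            counters[fields[0]] = [int(v) for v in fields[1:9]]
    return counters


def busy_percent(before: Dict[str, List[int]],
                 after: Dict[str, List[int]]) -> Dict[str, float]:
    """Per-core utilization (%) between two snapshots, counting iowait as idle."""
    result: Dict[str, float] = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        delta = [n - o for n, o in zip(new, old)]
        total = sum(delta)
        if total <= 0:
            continue  # no ticks elapsed for this core, nothing to report
        idle = delta[3] + (delta[4] if len(delta) > 4 else 0)
        result[cpu] = 100.0 * (total - idle) / total
    return result


def sample(interval: float = 1.0) -> Dict[str, float]:
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open("/proc/stat") as f:
        first = parse_proc_stat(f.read())
    time.sleep(interval)
    with open("/proc/stat") as f:
        second = parse_proc_stat(f.read())
    return busy_percent(first, second)
```

Calling sample(1.0) on the machine under test should give roughly the same
per-core numbers as the sar column above, which makes it easier to see
which /proc/stat fields (irq, softirq, idle) move while emon reports the
extra cycles.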
Based on how the kernel accounts time, do you have an idea why a high
number of interrupts affects time accounting?
Thanks,
-Solio