From: Ma Jun <[email protected]>
When the CPU that a non-balanced irq is bound to goes offline, the irq is migrated to another CPU,
usually the first online CPU.
Consider a system that has more than one non-balanced irq.
In the extreme case, all of these irqs are migrated to the same CPU, which puts that
CPU under high irq pressure and may even bring the system down.
So I think we may need to turn the non-balanced irq into an irq that can be
balanced, to avoid the problem described above.
This may not be a good solution to the problem; please suggest a better one
if you have it.
Signed-off-by: Ma Jun <[email protected]>
---
kernel/irq/cpuhotplug.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
index 011f8c4..80d54a5 100644
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -30,6 +30,8 @@ static bool migrate_one_irq(struct irq_desc *desc)
 		return false;
 
 	if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids) {
+		if (irq_settings_has_no_balance_set(desc))
+			irqd_clear(d, IRQD_NO_BALANCING);
 		affinity = cpu_online_mask;
 		ret = true;
 	}
--
1.7.1
Hi Ma Jun,
On 01/04/16 04:28, MaJun wrote:
> From: Ma Jun <[email protected]>
>
> When the CPU that a non-balanced irq is bound to goes offline, the irq is migrated to another CPU,
> usually the first online CPU.
>
> Consider a system that has more than one non-balanced irq.
> In the extreme case, all of these irqs are migrated to the same CPU, which puts that
> CPU under high irq pressure and may even bring the system down.
It would take a hell of a lot of interrupts (and a very badly designed
system) for that system to collapse under the interrupt load. Whatever
people tend to think, interrupts are a very rare event.
Any moderately ancient CPU can take several hundred thousand
interrupts per second, and you still barely notice it (try any embedded
platform with a bunch of MMC controllers...).
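To put a number on the interrupt load of a given system, one rough approach is to sample the total on the "intr" line of /proc/stat twice and divide the difference by the interval. The sketch below is only that; the helper names parse_intr_total() and intr_rate() are made up for illustration and do not come from any existing tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the total interrupt count from the text of /proc/stat.
 * The first number on the "intr" line is the sum of all interrupts
 * serviced since boot.
 * Returns 0 on success, -1 if no usable "intr" line is found.
 */
int parse_intr_total(const char *stat_text, unsigned long long *total)
{
	const char *line = stat_text;

	while (line && *line) {
		if (strncmp(line, "intr ", 5) == 0)
			return sscanf(line + 5, "%llu", total) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}

/* Interrupts per second between two samples taken `seconds` apart. */
double intr_rate(unsigned long long before, unsigned long long after,
		 double seconds)
{
	return (double)(after - before) / seconds;
}
```

Reading /proc/stat into a buffer, sleeping for a second, reading it again and passing both totals to intr_rate() gives a system-wide figure; per-CPU and per-irq counts are in /proc/interrupts.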
Now, let's get to the actual question:
> So I think we may need to turn the non-balanced irq into an irq that can be
> balanced, to avoid the problem described above.
But what makes you think that you can safely clear that flag? If it has
been excluded from balancing, that's surely for a good reason, and the
device driver that requested this probably doesn't expect the interrupt
affinity to change, other than by the effect of CPU hotplug itself.
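For reference, the hotplug fallback under discussion can be modelled in user space with plain bitmasks. This is a simplified sketch, not the kernel code: struct model_irq and both functions are hypothetical stand-ins, and picking the lowest-numbered CPU in the mask is an assumption about what the irqchip does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for an interrupt descriptor. */
struct model_irq {
	uint32_t affinity;	/* bit n set: CPU n may handle this irq */
	bool no_balancing;	/* models IRQD_NO_BALANCING */
};

/*
 * Model of the fallback in migrate_one_irq(). `online` is the set of
 * CPUs still online, i.e. it already excludes the CPU going down.
 * Returns true when the affinity had to be broken.
 */
bool model_migrate_one_irq(struct model_irq *irq, uint32_t online)
{
	assert(online != 0);	/* at least one CPU must stay online */

	if ((irq->affinity & online) != 0)
		return false;	/* an affine CPU is still online: mask is kept */

	irq->affinity = online;	/* fall back to every online CPU */
	return true;
}

/* Assumption: the irqchip targets the lowest-numbered CPU in the mask. */
int model_target_cpu(const struct model_irq *irq)
{
	for (int cpu = 0; cpu < 32; cpu++)
		if (irq->affinity & (UINT32_C(1) << cpu))
			return cpu;
	return -1;
}
```

With two irqs pinned to CPU 2 and CPU 3 and only CPUs 0 and 1 left online, both end up targeting CPU 0 in this model, and no_balancing is left untouched, which is the behaviour the patch proposes to change.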
So if you're seeing a problem with an interrupt not being balanced,
please first investigate *why* the driver asked for it in the first place.
But to the best of my understanding, this patch doesn't solve anything.
Thanks,
N,
--
Jazz is not dead. It just smells funny...