2019-08-22 20:00:52

by Neil Horman

[permalink] [raw]
Subject: [PATCH V2] x86: Add irq spillover warning

On Intel hardware, cpus are limited in the number of irqs they can
have affined to them (currently 240), based on section 10.5.2 of:
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf

assign_irq_vector_any_locked() will attempt to honor the affining
request, but if the 240 vector limitation documented above is crossed, a
new mask will be selected that is potentially outside the requested cpu
set silently. This can lead to unexpected behavior for administrators.

Mitigate this problem by checking the affinity mask after its been
assigned in activate_reserved so that adminstrators get a logged warning
about the change.

Tested successfully by the reporter

Change Notes:
V1->V2)
* Moved the check for this condition to activate_reserved from
do_IRQ, taking it out of the hot path (request by [email protected])

Signed-off-by: Neil Horman <[email protected]>
Reported-by: [email protected]
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: [email protected]
---
arch/x86/kernel/apic/vector.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index fdacb864c3dd..b8ed0406d41f 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -398,6 +398,16 @@ static int activate_reserved(struct irq_data *irqd)
if (!irqd_can_reserve(irqd))
apicd->can_reserve = false;
}
+
+ /*
+ * Check to ensure that the effective affinity mask is a subset
+ * the user supplied affinity mask, and warn the user if it is not
+ */
+ if (!cpumask_subset(irq_data_get_effective_affinity_mask(irqd),
+ irq_data_get_affinity_mask(irqd)))
+ pr_warn("irq %d has been assigned to a cpu outside of its user affinity mask\n",
+ irqd->irq);
+
return ret;
}

--
2.21.0


2019-08-28 12:49:32

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH V2] x86: Add irq spillover warning

Neil,

On Thu, 22 Aug 2019, Neil Horman wrote:

Just a few nits.

> On Intel hardware, cpus are limited in the number of irqs they can
> have affined to them (currently 240), based on section 10.5.2 of:
> https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3a-part-1-manual.pdf
>

Please do not add links to URLs which are not reliable and sections which
have a different number in the next version of the document. Just cite the
3 relevant lines or transcribe them.

Aside of that the above is not really accurate:

1) This is not restricted to Intel. All x86 CPUs of all vendors behave
that way.

2) CPUs have a vector space of 256. CPUs reserve 32 vectors (0..31). That
leaves 224. The kernel reserves another 22 vectors for various purposes.

That leaves 202 vectors for assignment to devices and that's what this
is about.

> assign_irq_vector_any_locked() will attempt to honor the affining
> request, but if the 240 vector limitation documented above is crossed, a

that means the vector space is exhausted.

> new mask will be selected that is potentially outside the requested cpu

It's not potentially outside. The point is that the requested affinity mask
has no vectors left, so it falls back to a wider cpumask and is guaranteed
to select a CPU which is not in the requested affinity mask.

> set silently. This can lead to unexpected behavior for administrators.
>
> Mitigate this problem by checking the affinity mask after its been
> assigned in activate_reserved so that adminstrators get a logged warning
> about the change.

Please do not describe implementation details which can be seen from the
patch itself.

> Tested successfully by the reporter

We have a 'Tested-by:' tag for this

> Change Notes:
> V1->V2)
> * Moved the check for this condition to activate_reserved from
> do_IRQ, taking it out of the hot path (request by [email protected])

Please put change notes (and thanks for providing them) below the '---'
discard line. They are not part of the final changelog in git. They are
informative for the reviewers, but if in the changelog it's manual work to
remove them while the discard section goes away automatically.

> +
> + /*
> + * Check to ensure that the effective affinity mask is a subset
> + * the user supplied affinity mask, and warn the user if it is not
> + */
> + if (!cpumask_subset(irq_data_get_effective_affinity_mask(irqd),
> + irq_data_get_affinity_mask(irqd)))
> + pr_warn("irq %d has been assigned to a cpu outside of its user affinity mask\n",

s/%d/%u/ irqd->irq is unsigned int.

So you tell what, but no hint about the why. That should be:

"irq %u: Affinity broken due to vector space exhaustion.\n"

That actually tells that it happened and gives the administrator
information why. So he knows that he tried to assign too many interrupts to
a set of CPUs.

> + irqd->irq);

This is a multiline statement and wants brackets around it.

> +
> return ret;

I fixed that all up while applying, so no need to resend.

Thanks,

tglx

Subject: [tip: x86/apic] x86/apic/vector: Warn when vector space exhaustion breaks affinity

The following commit has been merged into the x86/apic branch of tip:

Commit-ID: 743dac494d61d991967ebcfab92e4f80dc7583b3
Gitweb: https://git.kernel.org/tip/743dac494d61d991967ebcfab92e4f80dc7583b3
Author: Neil Horman <[email protected]>
AuthorDate: Thu, 22 Aug 2019 10:34:21 -04:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 28 Aug 2019 14:44:08 +02:00

x86/apic/vector: Warn when vector space exhaustion breaks affinity

On x86, CPUs are limited in the number of interrupts they can have affined
to them as they only support 256 interrupt vectors per CPU. 32 vectors are
reserved for the CPU and the kernel reserves another 22 for internal
purposes. That leaves 202 vectors for assignement to devices.

When an interrupt is set up or the affinity is changed by the kernel or the
administrator, the vector assignment code attempts to honor the requested
affinity mask. If the vector space on the CPUs in that affinity mask is
exhausted the code falls back to a wider set of CPUs and assigns a vector
on a CPU outside of the requested affinity mask silently.

While the effective affinity is reflected in the corresponding
/proc/irq/$N/effective_affinity* files the silent breakage of the requested
affinity can lead to unexpected behaviour for administrators.

Add a pr_warn() when this happens so that adminstrators get at least
informed about it in the syslog.

[ tglx: Massaged changelog and made the pr_warn() more informative ]

Reported-by: [email protected]
Signed-off-by: Neil Horman <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/apic/vector.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index fdacb86..2c5676b 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -398,6 +398,17 @@ static int activate_reserved(struct irq_data *irqd)
if (!irqd_can_reserve(irqd))
apicd->can_reserve = false;
}
+
+ /*
+ * Check to ensure that the effective affinity mask is a subset
+ * the user supplied affinity mask, and warn the user if it is not
+ */
+ if (!cpumask_subset(irq_data_get_effective_affinity_mask(irqd),
+ irq_data_get_affinity_mask(irqd))) {
+ pr_warn("irq %u: Affinity broken due to vector space exhaustion.\n",
+ irqd->irq);
+ }
+
return ret;
}