2023-03-28 07:32:34

by Saurabh Singh Sengar

[permalink] [raw]
Subject: [PATCH v2] x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

arch_dynirq_lower_bound() is invoked by the core interrupt code to
retrieve the lowest possible Linux interrupt number for dynamically
allocated interrupts like MSI.

The x86 implementation uses this to exclude the IO/APIC GSI space.
This works correctly as long as there is an IO/APIC registered, but
returns 0 if not. This has been observed in VMs where the BIOS does
not advertise an IO/APIC.

0 is an invalid interrupt number except for the legacy timer interrupt
on x86. The return value is unchecked in the core code, so it ends up
to allocate interrupt number 0 which is subsequently considered to be
invalid by the caller, e.g. the MSI allocation code.

The function has already a check for 0 in the case that an IO/APIC is
registered, but ioapic_dynirq_base is 0 in case of device tree setups.

Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
which is used in the case that no IO/APIC is registered.

Fixes: 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines")
Co-developed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Saurabh Sengar <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Thomas Gleixner <[email protected]>
---
[V2]
- Edit commit message
- Consolidated the 0 check for ioapic_dynirq_base as well

arch/x86/kernel/apic/io_apic.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 1f83b052bb74..f980b38b0227 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2477,17 +2477,21 @@ static int io_apic_get_redir_entries(int ioapic)

unsigned int arch_dynirq_lower_bound(unsigned int from)
{
+ unsigned int ret;
+
/*
* dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
* gsi_top if ioapic_dynirq_base hasn't been initialized yet.
*/
- if (!ioapic_initialized)
- return gsi_top;
+ ret = ioapic_dynirq_base ? : gsi_top;
+
/*
- * For DT enabled machines ioapic_dynirq_base is irrelevant and not
- * updated. So simply return @from if ioapic_dynirq_base == 0.
+ * For DT enabled machines ioapic_dynirq_base is irrelevant and
+ * always 0. gsi_top can be 0 if there is no IO/APIC registered.
+ * 0 is an invalid interrupt number for dynamic allocations. Return
+ * @from instead.
*/
- return ioapic_dynirq_base ? : from;
+ return ret ? : from;
}

#ifdef CONFIG_X86_32
--
2.34.1


2023-03-28 14:06:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH v2] x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

On March 28, 2023 12:30:04 AM PDT, Saurabh Sengar <[email protected]> wrote:
>arch_dynirq_lower_bound() is invoked by the core interrupt code to
>retrieve the lowest possible Linux interrupt number for dynamically
>allocated interrupts like MSI.
>
>The x86 implementation uses this to exclude the IO/APIC GSI space.
>This works correctly as long as there is an IO/APIC registered, but
>returns 0 if not. This has been observed in VMs where the BIOS does
>not advertise an IO/APIC.
>
>0 is an invalid interrupt number except for the legacy timer interrupt
>on x86. The return value is unchecked in the core code, so it ends up
>to allocate interrupt number 0 which is subsequently considered to be
>invalid by the caller, e.g. the MSI allocation code.
>
>The function has already a check for 0 in the case that an IO/APIC is
>registered, but ioapic_dynirq_base is 0 in case of device tree setups.
>
>Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
>which is used in the case that no IO/APIC is registered.
>
>Fixes: 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines")
>Co-developed-by: Thomas Gleixner <[email protected]>
>Signed-off-by: Thomas Gleixner <[email protected]>
>Signed-off-by: Saurabh Sengar <[email protected]>
>Cc: Andy Shevchenko <[email protected]>
>Cc: Thomas Gleixner <[email protected]>
>---
>[V2]
>- Edit commit message
>- Consolidated the 0 check for ioapic_dynirq_base as well
>
> arch/x86/kernel/apic/io_apic.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
>diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>index 1f83b052bb74..f980b38b0227 100644
>--- a/arch/x86/kernel/apic/io_apic.c
>+++ b/arch/x86/kernel/apic/io_apic.c
>@@ -2477,17 +2477,21 @@ static int io_apic_get_redir_entries(int ioapic)
>
> unsigned int arch_dynirq_lower_bound(unsigned int from)
> {
>+ unsigned int ret;
>+
> /*
> * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
> * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
> */
>- if (!ioapic_initialized)
>- return gsi_top;
>+ ret = ioapic_dynirq_base ? : gsi_top;
>+
> /*
>- * For DT enabled machines ioapic_dynirq_base is irrelevant and not
>- * updated. So simply return @from if ioapic_dynirq_base == 0.
>+ * For DT enabled machines ioapic_dynirq_base is irrelevant and
>+ * always 0. gsi_top can be 0 if there is no IO/APIC registered.
>+ * 0 is an invalid interrupt number for dynamic allocations. Return
>+ * @from instead.
> */
>- return ioapic_dynirq_base ? : from;
>+ return ret ? : from;
> }
>
> #ifdef CONFIG_X86_32

Is there any reason why this variable can't be initialized to a fixed nonzero number, like 16?

2023-03-28 14:50:25

by Saurabh Singh Sengar

[permalink] [raw]
Subject: Re: [PATCH v2] x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

On Tue, Mar 28, 2023 at 06:59:04AM -0700, H. Peter Anvin wrote:
> On March 28, 2023 12:30:04 AM PDT, Saurabh Sengar <[email protected]> wrote:
> >arch_dynirq_lower_bound() is invoked by the core interrupt code to
> >retrieve the lowest possible Linux interrupt number for dynamically
> >allocated interrupts like MSI.
> >
> >The x86 implementation uses this to exclude the IO/APIC GSI space.
> >This works correctly as long as there is an IO/APIC registered, but
> >returns 0 if not. This has been observed in VMs where the BIOS does
> >not advertise an IO/APIC.
> >
> >0 is an invalid interrupt number except for the legacy timer interrupt
> >on x86. The return value is unchecked in the core code, so it ends up
> >to allocate interrupt number 0 which is subsequently considered to be
> >invalid by the caller, e.g. the MSI allocation code.
> >
> >The function has already a check for 0 in the case that an IO/APIC is
> >registered, but ioapic_dynirq_base is 0 in case of device tree setups.
> >
> >Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
> >which is used in the case that no IO/APIC is registered.
> >
> >Fixes: 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines")
> >Co-developed-by: Thomas Gleixner <[email protected]>
> >Signed-off-by: Thomas Gleixner <[email protected]>
> >Signed-off-by: Saurabh Sengar <[email protected]>
> >Cc: Andy Shevchenko <[email protected]>
> >Cc: Thomas Gleixner <[email protected]>
> >---
> >[V2]
> >- Edit commit message
> >- Consolidated the 0 check for ioapic_dynirq_base as well
> >
> > arch/x86/kernel/apic/io_apic.c | 14 +++++++++-----
> > 1 file changed, 9 insertions(+), 5 deletions(-)
> >
> >diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> >index 1f83b052bb74..f980b38b0227 100644
> >--- a/arch/x86/kernel/apic/io_apic.c
> >+++ b/arch/x86/kernel/apic/io_apic.c
> >@@ -2477,17 +2477,21 @@ static int io_apic_get_redir_entries(int ioapic)
> >
> > unsigned int arch_dynirq_lower_bound(unsigned int from)
> > {
> >+ unsigned int ret;
> >+
> > /*
> > * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
> > * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
> > */
> >- if (!ioapic_initialized)
> >- return gsi_top;
> >+ ret = ioapic_dynirq_base ? : gsi_top;
> >+
> > /*
> >- * For DT enabled machines ioapic_dynirq_base is irrelevant and not
> >- * updated. So simply return @from if ioapic_dynirq_base == 0.
> >+ * For DT enabled machines ioapic_dynirq_base is irrelevant and
> >+ * always 0. gsi_top can be 0 if there is no IO/APIC registered.
> >+ * 0 is an invalid interrupt number for dynamic allocations. Return
> >+ * @from instead.
> > */
> >- return ioapic_dynirq_base ? : from;
> >+ return ret ? : from;
> > }
> >
> > #ifdef CONFIG_X86_32
>
> Is there any reason why this variable can't be initialized to a fixed nonzero number, like 16?

Yes, initializing gst_top to any non-zero value should fix this issue.
At first I thought to intialize gst_top to 1. But then I looked at how
the ioapic_dynirq_base case is handled and followed a similar approach.

Regards,
Saurabh

2023-04-11 06:10:31

by Saurabh Singh Sengar

[permalink] [raw]
Subject: Re: [PATCH v2] x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

On Tue, Mar 28, 2023 at 12:30:04AM -0700, Saurabh Sengar wrote:
> arch_dynirq_lower_bound() is invoked by the core interrupt code to
> retrieve the lowest possible Linux interrupt number for dynamically
> allocated interrupts like MSI.
>
> The x86 implementation uses this to exclude the IO/APIC GSI space.
> This works correctly as long as there is an IO/APIC registered, but
> returns 0 if not. This has been observed in VMs where the BIOS does
> not advertise an IO/APIC.
>
> 0 is an invalid interrupt number except for the legacy timer interrupt
> on x86. The return value is unchecked in the core code, so it ends up
> to allocate interrupt number 0 which is subsequently considered to be
> invalid by the caller, e.g. the MSI allocation code.
>
> The function has already a check for 0 in the case that an IO/APIC is
> registered, but ioapic_dynirq_base is 0 in case of device tree setups.
>
> Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
> which is used in the case that no IO/APIC is registered.
>
> Fixes: 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines")
> Co-developed-by: Thomas Gleixner <[email protected]>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Signed-off-by: Saurabh Sengar <[email protected]>
> Cc: Andy Shevchenko <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> ---
> [V2]
> - Edit commit message
> - Consolidated the 0 check for ioapic_dynirq_base as well
>
> arch/x86/kernel/apic/io_apic.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 1f83b052bb74..f980b38b0227 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2477,17 +2477,21 @@ static int io_apic_get_redir_entries(int ioapic)
>
> unsigned int arch_dynirq_lower_bound(unsigned int from)
> {
> + unsigned int ret;
> +
> /*
> * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
> * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
> */
> - if (!ioapic_initialized)
> - return gsi_top;
> + ret = ioapic_dynirq_base ? : gsi_top;
> +
> /*
> - * For DT enabled machines ioapic_dynirq_base is irrelevant and not
> - * updated. So simply return @from if ioapic_dynirq_base == 0.
> + * For DT enabled machines ioapic_dynirq_base is irrelevant and
> + * always 0. gsi_top can be 0 if there is no IO/APIC registered.
> + * 0 is an invalid interrupt number for dynamic allocations. Return
> + * @from instead.
> */
> - return ioapic_dynirq_base ? : from;
> + return ret ? : from;
> }
>
> #ifdef CONFIG_X86_32

Is this good to get accepted ? Please let me know if anything pending from my end on this.

- Saurabh

> --
> 2.34.1

2023-04-12 15:45:39

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2] x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

On Tue, Mar 28 2023 at 07:48, Saurabh Singh Sengar wrote:
> On Tue, Mar 28, 2023 at 06:59:04AM -0700, H. Peter Anvin wrote:
>>
>> Is there any reason why this variable can't be initialized to a fixed nonzero number, like 16?
>
> Yes, initializing gst_top to any non-zero value should fix this issue.
> At first I thought to intialize gst_top to 1.

That works only in your case. Some boot time registrations of IO_APICs
use gsi_top as the base. So initializing gsi_top to N would move IOAPIC[0]
interrupts out to irq N... and make the legacy interrupts fail.

That whole IOAPIC registration could do with some major cleanup, but
that's a different story.

Thanks,

tglx

Subject: [tip: x86/apic] x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

The following commit has been merged into the x86/apic branch of tip:

Commit-ID: 5af507bef93c09a94fb8f058213b489178f4cbe5
Gitweb: https://git.kernel.org/tip/5af507bef93c09a94fb8f058213b489178f4cbe5
Author: Saurabh Sengar <[email protected]>
AuthorDate: Tue, 28 Mar 2023 00:30:04 -07:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Wed, 12 Apr 2023 17:45:50 +02:00

x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

arch_dynirq_lower_bound() is invoked by the core interrupt code to
retrieve the lowest possible Linux interrupt number for dynamically
allocated interrupts like MSI.

The x86 implementation uses this to exclude the IO/APIC GSI space.
This works correctly as long as there is an IO/APIC registered, but
returns 0 if not. This has been observed in VMs where the BIOS does
not advertise an IO/APIC.

0 is an invalid interrupt number except for the legacy timer interrupt
on x86. The return value is unchecked in the core code, so it ends up
to allocate interrupt number 0 which is subsequently considered to be
invalid by the caller, e.g. the MSI allocation code.

The function has already a check for 0 in the case that an IO/APIC is
registered, as ioapic_dynirq_base is 0 in case of device tree setups.

Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
which is used in the case that no IO/APIC is registered.

Fixes: 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines")
Signed-off-by: Saurabh Sengar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
arch/x86/kernel/apic/io_apic.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 1f83b05..f980b38 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2477,17 +2477,21 @@ static int io_apic_get_redir_entries(int ioapic)

unsigned int arch_dynirq_lower_bound(unsigned int from)
{
+ unsigned int ret;
+
/*
* dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
* gsi_top if ioapic_dynirq_base hasn't been initialized yet.
*/
- if (!ioapic_initialized)
- return gsi_top;
+ ret = ioapic_dynirq_base ? : gsi_top;
+
/*
- * For DT enabled machines ioapic_dynirq_base is irrelevant and not
- * updated. So simply return @from if ioapic_dynirq_base == 0.
+ * For DT enabled machines ioapic_dynirq_base is irrelevant and
+ * always 0. gsi_top can be 0 if there is no IO/APIC registered.
+ * 0 is an invalid interrupt number for dynamic allocations. Return
+ * @from instead.
*/
- return ioapic_dynirq_base ? : from;
+ return ret ? : from;
}

#ifdef CONFIG_X86_32