2023-03-02 19:34:57

by Saurabh Sengar

[permalink] [raw]
Subject: [PATCH] x86/ioapic: Don't return 0 as valid virq

Zero is invalid virq and should't be returned as a valid value for
lower irq bound. If IO-APIC and gsi_top are not initialized return
the 'from' value as virq.

Signed-off-by: Saurabh Sengar <[email protected]>
---
arch/x86/kernel/apic/io_apic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index a868b76cd3d4..000cc6159b8b 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2483,9 +2483,11 @@ unsigned int arch_dynirq_lower_bound(unsigned int from)
/*
* dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
* gsi_top if ioapic_dynirq_base hasn't been initialized yet.
+ *
+ * Incase gsi_top is also not initialized return @from.
*/
if (!ioapic_initialized)
- return gsi_top;
+ return gsi_top ? : from;
/*
* For DT enabled machines ioapic_dynirq_base is irrelevant and not
* updated. So simply return @from if ioapic_dynirq_base == 0.
--
2.34.1



2023-03-09 17:17:45

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> Zero is invalid virq and should't be returned as a valid value for
> lower irq bound. If IO-APIC and gsi_top are not initialized return
> the 'from' value as virq.
>
> Signed-off-by: Saurabh Sengar <[email protected]>
> ---
> arch/x86/kernel/apic/io_apic.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index a868b76cd3d4..000cc6159b8b 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2483,9 +2483,11 @@ unsigned int arch_dynirq_lower_bound(unsigned int from)
> /*
> * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
> * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
> + *
> + * Incase gsi_top is also not initialized return @from.
> */
> if (!ioapic_initialized)
> - return gsi_top;
> + return gsi_top ? : from;
> /*
> * For DT enabled machines ioapic_dynirq_base is irrelevant and not
> * updated. So simply return @from if ioapic_dynirq_base == 0.
> --
> 2.34.1

Hi Maintainers,

May I get your feedback on this patch.

Regards,
Saurabh

2023-03-12 20:40:30

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> Zero is invalid virq and should't be returned as a valid value for
> lower irq bound. If IO-APIC and gsi_top are not initialized return

Why isn't gsi_top initialized?

What is this fixing?

Don't be afraid to do

git annotate arch/x86/kernel/apic/io_apic.c

and see which commit added this. This one:

3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines")

Now add the folks from this commit to Cc and tell them why in your case
gsi_top is not initialized and what they're breaking by doing that.

The more your commit message explains *why* you're fixing something, the
better it is for the maintainers/reviewers to actually know what to do.

Right now I'm reading this and I'm thinking, random, unjustified change.
Ignore.

Ok?

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-03-13 03:29:57

by Saurabh Singh Sengar

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

Cc: [email protected], [email protected]

Thanks for you comments, please see my responses below.

> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Monday, March 13, 2023 2:10 AM
> To: Saurabh Sengar <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Michael Kelley (LINUX)
> <[email protected]>; [email protected]
> Subject: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
>
> On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> > Zero is invalid virq and should't be returned as a valid value for
> > lower irq bound. If IO-APIC and gsi_top are not initialized return
>
> Why isn't gsi_top initialized?
>
> What is this fixing?

In the absence of a device tree node for IO-APIC, IO-APIC is not registered,
resulting in uninitialized gsi_top. And in such cases arch_dynirq_lower_bound
will return 0. Returning 0 from this function will allow interrupts to have 0
assigned as valid irq, which is wrong. In case gsi_top is 0, lower bound of irq
should be derived from 'hint' value passed to function as 'from'.

I can add above info in commit message, please let me know if anything
more to be added.

To be specific in our system which is a guest VM we don't need IO-APIC and hence
there is no device tree node for it. It is observed that we get irq 0 assigned to PCI-MSI.

>
> Don't be afraid to do
>
> git annotate arch/x86/kernel/apic/io_apic.c
>
> and see which commit added this. This one:
>
> 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled
> machines")
>
> Now add the folks from this commit to Cc and tell them why in your case
> gsi_top is not initialized and what they're breaking by doing that.

Thanks. I will add "Fixes:" and "Cc:" tag in next version.

>
> The more your commit message explains *why* you're fixing something, the
> better it is for the maintainers/reviewers to actually know what to do.
>
> Right now I'm reading this and I'm thinking, random, unjustified change.
> Ignore.
>
> Ok?
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl
> e.kernel.org%2Ftglx%2Fnotes-about-
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C6e1e0e21051c4
> 9c1cfe008db233a0376%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> 7C638142504360574969%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> %7C&sdata=OLdgb1AuLbLvlzucgNFBQEEK6G%2FsFV%2BO2TqT%2FNCujJU%3
> D&reserved=0

2023-03-13 03:38:00

by Saurabh Singh Sengar

[permalink] [raw]
Subject: Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

I just see mail to [email protected] is undelivered, shall I still add it in "Cc:" ?
Please let me know what we usually do in such cases.

Regards,
Saurabh


> -----Original Message-----
> From: Saurabh Singh Sengar <[email protected]>
> Sent: Monday, March 13, 2023 9:00 AM
> To: Borislav Petkov <[email protected]>; Saurabh Sengar
> <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; Michael Kelley (LINUX)
> <[email protected]>; [email protected];
> [email protected]; [email protected]
> Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
>
> Cc: [email protected], [email protected]
>
> Thanks for you comments, please see my responses below.
>
> > -----Original Message-----
> > From: Borislav Petkov <[email protected]>
> > Sent: Monday, March 13, 2023 2:10 AM
> > To: Saurabh Sengar <[email protected]>
> > Cc: [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected]; [email protected];
> > [email protected]; Michael Kelley (LINUX)
> > <[email protected]>; [email protected]
> > Subject: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid
> > virq
> >
> > On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote:
> > > Zero is invalid virq and should't be returned as a valid value for
> > > lower irq bound. If IO-APIC and gsi_top are not initialized return
> >
> > Why isn't gsi_top initialized?
> >
> > What is this fixing?
>
> In the absence of a device tree node for IO-APIC, IO-APIC is not registered,
> resulting in uninitialized gsi_top. And in such cases arch_dynirq_lower_bound
> will return 0. Returning 0 from this function will allow interrupts to have 0
> assigned as valid irq, which is wrong. In case gsi_top is 0, lower bound of irq
> should be derived from 'hint' value passed to function as 'from'.
>
> I can add above info in commit message, please let me know if anything
> more to be added.
>
> To be specific in our system which is a guest VM we don't need IO-APIC and
> hence there is no device tree node for it. It is observed that we get irq 0
> assigned to PCI-MSI.
>
> >
> > Don't be afraid to do
> >
> > git annotate arch/x86/kernel/apic/io_apic.c
> >
> > and see which commit added this. This one:
> >
> > 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT
> > enabled
> > machines")
> >
> > Now add the folks from this commit to Cc and tell them why in your
> > case gsi_top is not initialized and what they're breaking by doing that.
>
> Thanks. I will add "Fixes:" and "Cc:" tag in next version.
>
> >
> > The more your commit message explains *why* you're fixing something,
> > the better it is for the maintainers/reviewers to actually know what to do.
> >
> > Right now I'm reading this and I'm thinking, random, unjustified change.
> > Ignore.
> >
> > Ok?
> >
> > Thx.
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeop
> >
> l%2F&data=05%7C01%7Cssengar%40microsoft.com%7C0595a41023f849c5ee
> b308db
> >
> 23732cd0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638142749
> 8888650
> >
> 08%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzI
> iLCJBTiI
> >
> 6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=qpOM5MMYpUof
> VOaNsp8HxTmv
> > %2B80iVn5rFfzNQTlTwLw%3D&reserved=0
> > e.kernel.org%2Ftglx%2Fnotes-about-
> >
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C6e1e0e21051c4
> >
> 9c1cfe008db233a0376%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> >
> 7C638142504360574969%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> >
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> >
> %7C&sdata=OLdgb1AuLbLvlzucgNFBQEEK6G%2FsFV%2BO2TqT%2FNCujJU%3
> > D&reserved=0

2023-03-13 11:07:31

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

On Mon, Mar 13, 2023 at 03:37:44AM +0000, Saurabh Singh Sengar wrote:
> I just see mail to [email protected] is undelivered, shall I still add it in "Cc:" ?
> Please let me know what we usually do in such cases.

You were posting just fine inline, why do you have to top-post now?

If it is undelivered, then the address is probably invalid now so drop
it.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-03-13 11:20:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote:
> To be specific in our system which is a guest VM we don't need IO-APIC and hence
> there is no device tree node for it. It is observed that we get irq 0 assigned to PCI-MSI.

This should be added to your commit message: what guest VM is that and
why should the kernel support it.

Why doesn't it need an IO-APIC and why does the current code need to be
changed just for your guest VM?

What else needs to be changed so that your VM works?

Where is that VM's documentation and why can't that VM be fixed *not* to
need kernel changes? IOW, why can't that VM emulate an IO-APIC like the
others do...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2023-03-14 10:24:46

by Saurabh Singh Sengar

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq



> -----Original Message-----
> From: Borislav Petkov <[email protected]>
> Sent: Monday, March 13, 2023 4:44 PM
> To: Saurabh Singh Sengar <[email protected]>
> Cc: Saurabh Sengar <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> Michael Kelley (LINUX) <[email protected]>; linux-
> [email protected]; [email protected];
> [email protected]
> Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
>
> On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote:
> > To be specific in our system which is a guest VM we don't need IO-APIC
> > and hence there is no device tree node for it. It is observed that we get irq 0
> assigned to PCI-MSI.
>
> This should be added to your commit message: what guest VM is that and
> why should the kernel support it.

Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
documentation is in Documentation/virt/hyperv/.

In commit I wanted to mention that any system which is not registering IO-APIC
will have this issue. But I am fine to mention specifically about the issue I am facing.
As part of your next comment, I have explained the issue in detail if that is good, I
can put that as commit message.

>
> Why doesn't it need an IO-APIC and why does the current code need to be
> changed just for your guest VM?

For Hyper-V Virtual Machines, few platforms don't have any devices to be
hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
assigns interrupts to PCIe devices. In such platforms IO-APIC is not
registered which causes gsi_top value to remain at 0 and not get properly
assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
function in io_apic.c decides the lower bound of irq numbers based on gsi_top.

Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
virq number because gsi_top is still 0. 0 being invalid virq is ignored by
MSI irq domain and results allocation of the same PCIe MSI twice.

CPU0 CPU1
0: 2 0 Hyper-V PCIe MSI 1073741824-edge
1: 69 0 Hyper-V PCIe MSI 1073741824-edge nvme0q0

To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
This will also be identical to the behaviour of weak arch_dynirq_lower_bound
function defined in kernel/softirq.c.

>
> What else needs to be changed so that your VM works?

This is the only change required.

>
> Where is that VM's documentation and why can't that VM be fixed *not* to
> need kernel changes? IOW, why can't that VM emulate an IO-APIC like the
> others do...

Documentation is mentioned above. As there is no need of IO-APIC there is
no need emulating it.

Please let me know if there is any further clarification required.

>
> --
> Regards/Gruss,
> Boris.
>
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl
> e.kernel.org%2Ftglx%2Fnotes-about-
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C817c78e7bb324
> 8cd73b708db23b41c2a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> 7C638143028755917117%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> %7C&sdata=3N5Mkl2gjMPHKOJGykZ3LvM6h%2FfD86dXLTQo3VH0Svc%3D&re
> served=0

2023-03-24 07:12:12

by Saurabh Singh Sengar

[permalink] [raw]
Subject: Re: [PATCH] x86/ioapic: Don't return 0 as valid virq



> -----Original Message-----
> From: Saurabh Singh Sengar <[email protected]>
> Sent: Tuesday, March 14, 2023 3:54 PM
> To: Borislav Petkov <[email protected]>
> Cc: Saurabh Sengar <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> Michael Kelley (LINUX) <[email protected]>; linux-
> [email protected]; [email protected];
> [email protected]
> Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
>
>
>
> > -----Original Message-----
> > From: Borislav Petkov <[email protected]>
> > Sent: Monday, March 13, 2023 4:44 PM
> > To: Saurabh Singh Sengar <[email protected]>
> > Cc: Saurabh Sengar <[email protected]>; [email protected];
> > [email protected]; [email protected]; [email protected];
> > [email protected]; [email protected]; [email protected];
> > Michael Kelley (LINUX) <[email protected]>; linux-
> > [email protected]; [email protected];
> > [email protected]
> > Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
> >
> > On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote:
> > > To be specific in our system which is a guest VM we don't need IO-APIC
> > > and hence there is no device tree node for it. It is observed that we get irq
> 0
> > assigned to PCI-MSI.
> >
> > This should be added to your commit message: what guest VM is that and
> > why should the kernel support it.
>
> Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
> documentation is in Documentation/virt/hyperv/.
>
> In commit I wanted to mention that any system which is not registering IO-
> APIC
> will have this issue. But I am fine to mention specifically about the issue I am
> facing.
> As part of your next comment, I have explained the issue in detail if that is
> good, I
> can put that as commit message.
>
> >
> > Why doesn't it need an IO-APIC and why does the current code need to be
> > changed just for your guest VM?
>
> For Hyper-V Virtual Machines, few platforms don't have any devices to be
> hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
> assigns interrupts to PCIe devices. In such platforms IO-APIC is not
> registered which causes gsi_top value to remain at 0 and not get properly
> assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
> flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
> function in io_apic.c decides the lower bound of irq numbers based on
> gsi_top.
>
> Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
> virq number because gsi_top is still 0. 0 being invalid virq is ignored by
> MSI irq domain and results allocation of the same PCIe MSI twice.
>
> CPU0 CPU1
> 0: 2 0 Hyper-V PCIe MSI
> 1073741824-edge
> 1: 69 0 Hyper-V PCIe MSI
> 1073741824-edge nvme0q0
>
> To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
> hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
> This will also be identical to the behaviour of weak arch_dynirq_lower_bound
> function defined in kernel/softirq.c.

Hi Borislav,

Have you had an opportunity to review the above commit message ?
Kindly provide me with your opinion on whether it meets your expectations.

Regards,
Saurabh

>
> >
> > What else needs to be changed so that your VM works?
>
> This is the only change required.
>
> >
> > Where is that VM's documentation and why can't that VM be fixed *not* to
> > need kernel changes? IOW, why can't that VM emulate an IO-APIC like the
> > others do...
>
> Documentation is mentioned above. As there is no need of IO-APIC there is
> no need emulating it.
>
> Please let me know if there is any further clarification required.
>
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl
> %2F&data=05%7C01%7Cssengar%40microsoft.com%7C84362c605bf04e6d56
> 6a08db247630d0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63
> 8143862345546032%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMD
> AiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C
> &sdata=7iCmUXeDyD%2Fqv%2BO63N6HG%2FPrS9HWP3yaGClT7X2RB0c%3D
> &reserved=0
> > e.kernel.org%2Ftglx%2Fnotes-about-
> >
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C817c78e7bb324
> >
> 8cd73b708db23b41c2a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> >
> 7C638143028755917117%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> >
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> >
> %7C&sdata=3N5Mkl2gjMPHKOJGykZ3LvM6h%2FfD86dXLTQo3VH0Svc%3D&re
> > served=0

2023-03-24 15:41:32

by Thomas Gleixner

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

On Tue, Mar 14 2023 at 10:23, Saurabh Singh Sengar wrote:
>> This should be added to your commit message: what guest VM is that and
>> why should the kernel support it.
>
> Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
> documentation is in Documentation/virt/hyperv/.
>
> In commit I wanted to mention that any system which is not registering
> IO-APIC will have this issue. But I am fine to mention specifically
> about the issue I am facing. As part of your next comment, I have
> explained the issue in detail if that is good, I can put that as
> commit message.
>>
>> Why doesn't it need an IO-APIC and why does the current code need to be
>> changed just for your guest VM?
>
> For Hyper-V Virtual Machines, few platforms don't have any devices to be
> hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
> assigns interrupts to PCIe devices. In such platforms IO-APIC is not
> registered which causes gsi_top value to remain at 0 and not get properly
> assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
> flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
> function in io_apic.c decides the lower bound of irq numbers based on gsi_top.
>
> Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
> virq number because gsi_top is still 0. 0 being invalid virq is ignored by
> MSI irq domain and results allocation of the same PCIe MSI twice.
>
> CPU0 CPU1
> 0: 2 0 Hyper-V PCIe MSI 1073741824-edge
> 1: 69 0 Hyper-V PCIe MSI 1073741824-edge nvme0q0
>
> To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
> hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
> This will also be identical to the behaviour of weak arch_dynirq_lower_bound
> function defined in kernel/softirq.c.

I find this mightly confusing. Something like this perhaps:

Subject: x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()

arch_dynirq_lower_bound() is invoked by the core interrupt code to
retrieve the lowest possible Linux interrupt number for dynamically
allocated interrupts like MSI.

The x86 implementation uses this to exclude the IO/APIC GSI space.
This works correctly as long as there is an IO/APIC registered, but
returns 0 if not. This has been observed in VMs where the BIOS does
not advertise an IO/APIC.

0 is an invalid interrupt number except for the legacy timer interrupt
on x86. The return value is unchecked in the core code, so it ends up
to allocate interrupt number 0 which is subsequently considered to be
invalid by the caller, e.g. the MSI allocation code.

The function has already a check for 0 in the case that an IO/APIC is
registered, but ioapic_dynirq_base is 0 in case of device tree setups.

Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
which is used in the case that no IO/APIC is registered.

And then make the code to look like the below, which makes it very
clear what this is about.

Thanks,

tglx
---
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2477,17 +2477,22 @@ static int io_apic_get_redir_entries(int

unsigned int arch_dynirq_lower_bound(unsigned int from)
{
+ unsigned int ret;
+
/*
* dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
* gsi_top if ioapic_dynirq_base hasn't been initialized yet.
*/
- if (!ioapic_initialized)
- return gsi_top;
+ ret = ioapic_dynirq_base ? : gsi_top;
+
/*
- * For DT enabled machines ioapic_dynirq_base is irrelevant and not
- * updated. So simply return @from if ioapic_dynirq_base == 0.
+ * For DT enabled machines ioapic_dynirq_base is irrelevant and
+ * always 0. gsi_top can be 0 if there is no IO/APIC registered.
+ *
+ * 0 is an invalid interrupt number for dynamic allocations. Return
+ * @from instead.
*/
- return ioapic_dynirq_base ? : from;
+ return ret ? : from;
}

#ifdef CONFIG_X86_32


2023-03-24 16:32:01

by Saurabh Sengar

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

On Fri, Mar 24, 2023 at 04:39:07PM +0100, Thomas Gleixner wrote:
> On Tue, Mar 14 2023 at 10:23, Saurabh Singh Sengar wrote:
> >> This should be added to your commit message: what guest VM is that and
> >> why should the kernel support it.
> >
> > Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
> > documentation is in Documentation/virt/hyperv/.
> >
> > In commit I wanted to mention that any system which is not registering
> > IO-APIC will have this issue. But I am fine to mention specifically
> > about the issue I am facing. As part of your next comment, I have
> > explained the issue in detail if that is good, I can put that as
> > commit message.
> >>
> >> Why doesn't it need an IO-APIC and why does the current code need to be
> >> changed just for your guest VM?
> >
> > For Hyper-V Virtual Machines, few platforms don't have any devices to be
> > hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
> > assigns interrupts to PCIe devices. In such platforms IO-APIC is not
> > registered which causes gsi_top value to remain at 0 and not get properly
> > assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
> > flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
> > function in io_apic.c decides the lower bound of irq numbers based on gsi_top.
> >
> > Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
> > virq number because gsi_top is still 0. 0 being invalid virq is ignored by
> > MSI irq domain and results allocation of the same PCIe MSI twice.
> >
> > CPU0 CPU1
> > 0: 2 0 Hyper-V PCIe MSI 1073741824-edge
> > 1: 69 0 Hyper-V PCIe MSI 1073741824-edge nvme0q0
> >
> > To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
> > hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
> > This will also be identical to the behaviour of weak arch_dynirq_lower_bound
> > function defined in kernel/softirq.c.
>
> I find this mightly confusing. Something like this perhaps:
>
> Subject: x86/ioapic: Don't return 0 from arch_dynirq_lower_bound()
>
> arch_dynirq_lower_bound() is invoked by the core interrupt code to
> retrieve the lowest possible Linux interrupt number for dynamically
> allocated interrupts like MSI.
>
> The x86 implementation uses this to exclude the IO/APIC GSI space.
> This works correctly as long as there is an IO/APIC registered, but
> returns 0 if not. This has been observed in VMs where the BIOS does
> not advertise an IO/APIC.
>
> 0 is an invalid interrupt number except for the legacy timer interrupt
> on x86. The return value is unchecked in the core code, so it ends up
> to allocate interrupt number 0 which is subsequently considered to be
> invalid by the caller, e.g. the MSI allocation code.
>
> The function has already a check for 0 in the case that an IO/APIC is
> registered, but ioapic_dynirq_base is 0 in case of device tree setups.
>
> Consolidate this and zero check for both ioapic_dynirq_base and gsi_top,
> which is used in the case that no IO/APIC is registered.
>
> And then make the code to look like the below, which makes it very
> clear what this is about.
>
> Thanks,
>
> tglx
> ---
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2477,17 +2477,22 @@ static int io_apic_get_redir_entries(int
>
> unsigned int arch_dynirq_lower_bound(unsigned int from)
> {
> + unsigned int ret;
> +
> /*
> * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use
> * gsi_top if ioapic_dynirq_base hasn't been initialized yet.
> */
> - if (!ioapic_initialized)
> - return gsi_top;
> + ret = ioapic_dynirq_base ? : gsi_top;
> +
> /*
> - * For DT enabled machines ioapic_dynirq_base is irrelevant and not
> - * updated. So simply return @from if ioapic_dynirq_base == 0.
> + * For DT enabled machines ioapic_dynirq_base is irrelevant and
> + * always 0. gsi_top can be 0 if there is no IO/APIC registered.
> + *
> + * 0 is an invalid interrupt number for dynamic allocations. Return
> + * @from instead.
> */
> - return ioapic_dynirq_base ? : from;
> + return ret ? : from;
> }
>
> #ifdef CONFIG_X86_32
>

Thanks you for your valuable suggestions. Commit message and code looks
much better now. I will send the V2 with your "Co-Developed-by" tag, I
hope its fine with you.

Regards,
Saurabh