2020-12-18 06:04:42

by Zenghui Yu

[permalink] [raw]
Subject: [PATCH] genirq/msi: Initialize msi_alloc_info to zero for msi_prepare API

Since commit 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if
allocating for a proxy device"), some of the devices are wrongly marked as
"shared" by the ITS driver on systems equipped with the ITS(es). The
problem is that the @info->flags may not be initialized anywhere and we end
up looking at random bits on the stack. That's obviously not good.

The straightforward fix is to properly initialize msi_alloc_info inside the
.prepare callback of affected MSI domains (its-pci-msi, its-platform-msi,
etc). We can also perform the initialization in IRQ core layer for
msi_domain_prepare_irqs() API and it looks much neater to me.

Signed-off-by: Zenghui Yu <[email protected]>
---

This was noticed when I was playing with the assigned devices on arm64 and
VFIO failed to enable MSI-X vectors for almost all VFs (CCed kvm list in
case others will hit the same issue). It turned out that these VFs are
marked as "shared" by mistake and have trouble with the following sequence:

pci_alloc_irq_vectors(pdev, 1, 1, flag);
pci_free_irq_vectors(pdev);
pci_alloc_irq_vectors(pdev, 1, 2, flag); --> we can only get
*one* vector

But besides VFIO, I guess there are already some devices get into trouble
at probe time and can't work properly.

kernel/irq/msi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 2c0c4d6d0f83..dc0e2d7fbdfd 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -402,7 +402,7 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
struct msi_domain_ops *ops = info->ops;
struct irq_data *irq_data;
struct msi_desc *desc;
- msi_alloc_info_t arg;
+ msi_alloc_info_t arg = { };
int i, ret, virq;
bool can_reserve;

--
2.19.1


2020-12-18 17:42:43

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] genirq/msi: Initialize msi_alloc_info to zero for msi_prepare API

Hi Zenghui,

On Fri, 18 Dec 2020 06:00:39 +0000,
Zenghui Yu <[email protected]> wrote:
>
> Since commit 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if
> allocating for a proxy device"), some of the devices are wrongly marked as
> "shared" by the ITS driver on systems equipped with the ITS(es). The
> problem is that the @info->flags may not be initialized anywhere and we end
> up looking at random bits on the stack. That's obviously not good.
>
> The straightforward fix is to properly initialize msi_alloc_info inside the
> .prepare callback of affected MSI domains (its-pci-msi, its-platform-msi,
> etc). We can also perform the initialization in IRQ core layer for
> msi_domain_prepare_irqs() API and it looks much neater to me.
>
> Signed-off-by: Zenghui Yu <[email protected]>
> ---
>
> This was noticed when I was playing with the assigned devices on arm64 and
> VFIO failed to enable MSI-X vectors for almost all VFs (CCed kvm list in
> case others will hit the same issue). It turned out that these VFs are
> marked as "shared" by mistake and have trouble with the following sequence:
>
> pci_alloc_irq_vectors(pdev, 1, 1, flag);
> pci_free_irq_vectors(pdev);
> pci_alloc_irq_vectors(pdev, 1, 2, flag); --> we can only get
> *one* vector
>
> But besides VFIO, I guess there are already some devices get into trouble
> at probe time and can't work properly.
>
> kernel/irq/msi.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> index 2c0c4d6d0f83..dc0e2d7fbdfd 100644
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -402,7 +402,7 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
> struct msi_domain_ops *ops = info->ops;
> struct irq_data *irq_data;
> struct msi_desc *desc;
> - msi_alloc_info_t arg;
> + msi_alloc_info_t arg = { };
> int i, ret, virq;
> bool can_reserve;

Thanks for having investigated this. I guess my only worry with this
is that msi_alloc_info_t is a pretty large structure on x86, and
zeroing it isn't totally free. But this definitely looks nicer than
some of the alternatives (.prepare isn't a good option, as we do rely
on the flag being set in __platform_msi_create_device_domain(), which
calls itself .prepare).

I'll queue it, and we can always revisit this later if Thomas (or
anyone else) has a better idea.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

Subject: [irqchip: irq/irqchip-next] genirq/msi: Initialize msi_alloc_info before calling msi_domain_prepare_irqs()

The following commit has been merged into the irq/irqchip-next branch of irqchip:

Commit-ID: 06fde695ee76429634c1e8c8c1154035aa61191e
Gitweb: https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms/06fde695ee76429634c1e8c8c1154035aa61191e
Author: Zenghui Yu <[email protected]>
AuthorDate: Fri, 18 Dec 2020 14:00:39 +08:00
Committer: Marc Zyngier <[email protected]>
CommitterDate: Fri, 18 Dec 2020 17:42:18

genirq/msi: Initialize msi_alloc_info before calling msi_domain_prepare_irqs()

Since commit 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if
allocating for a proxy device"), some of the devices are wrongly marked as
"shared" by the ITS driver on systems equipped with the ITS(es). The
problem is that the @info->flags may not be initialized anywhere and we end
up looking at random bits on the stack. That's obviously not good.

We can perform the initialization in the IRQ core layer before calling
msi_domain_prepare_irqs(), which is neat enough.

Fixes: 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if allocating for a proxy device")
Signed-off-by: Zenghui Yu <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
kernel/irq/msi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index 2c0c4d6..dc0e2d7 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -402,7 +402,7 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
struct msi_domain_ops *ops = info->ops;
struct irq_data *irq_data;
struct msi_desc *desc;
- msi_alloc_info_t arg;
+ msi_alloc_info_t arg = { };
int i, ret, virq;
bool can_reserve;

2020-12-21 04:51:30

by Zenghui Yu

[permalink] [raw]
Subject: Re: [PATCH] genirq/msi: Initialize msi_alloc_info to zero for msi_prepare API

Hi Marc,

On 2020/12/19 1:38, Marc Zyngier wrote:
> Hi Zenghui,
>
> On Fri, 18 Dec 2020 06:00:39 +0000,
> Zenghui Yu <[email protected]> wrote:
>>
>> Since commit 5fe71d271df8 ("irqchip/gic-v3-its: Tag ITS device as shared if
>> allocating for a proxy device"), some of the devices are wrongly marked as
>> "shared" by the ITS driver on systems equipped with the ITS(es). The
>> problem is that the @info->flags may not be initialized anywhere and we end
>> up looking at random bits on the stack. That's obviously not good.
>>
>> The straightforward fix is to properly initialize msi_alloc_info inside the
>> .prepare callback of affected MSI domains (its-pci-msi, its-platform-msi,
>> etc). We can also perform the initialization in IRQ core layer for
>> msi_domain_prepare_irqs() API and it looks much neater to me.
>>
>> Signed-off-by: Zenghui Yu <[email protected]>
>> ---
>>
>> This was noticed when I was playing with the assigned devices on arm64 and
>> VFIO failed to enable MSI-X vectors for almost all VFs (CCed kvm list in
>> case others will hit the same issue). It turned out that these VFs are
>> marked as "shared" by mistake and have trouble with the following sequence:
>>
>> pci_alloc_irq_vectors(pdev, 1, 1, flag);
>> pci_free_irq_vectors(pdev);
>> pci_alloc_irq_vectors(pdev, 1, 2, flag); --> we can only get
>> *one* vector
>>
>> But besides VFIO, I guess there are already some devices get into trouble
>> at probe time and can't work properly.
>>
>> kernel/irq/msi.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
>> index 2c0c4d6d0f83..dc0e2d7fbdfd 100644
>> --- a/kernel/irq/msi.c
>> +++ b/kernel/irq/msi.c
>> @@ -402,7 +402,7 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
>> struct msi_domain_ops *ops = info->ops;
>> struct irq_data *irq_data;
>> struct msi_desc *desc;
>> - msi_alloc_info_t arg;
>> + msi_alloc_info_t arg = { };
>> int i, ret, virq;
>> bool can_reserve;
>
> Thanks for having investigated this. I guess my only worry with this
> is that msi_alloc_info_t is a pretty large structure on x86, and
> zeroing it isn't totally free.

It seems that x86 will zero the whole msi_alloc_info_t structure at the
beginning of its .prepare (pci_msi_prepare()/init_irq_alloc_info()). If
this really affects something, I think we can easily address it with
some cleanup (on top of this patch).

> But this definitely looks nicer than
> some of the alternatives (.prepare isn't a good option, as we do rely
> on the flag being set in __platform_msi_create_device_domain(), which
> calls itself .prepare).

Indeed, thanks for fixing the commit message.

> I'll queue it, and we can always revisit this later if Thomas (or
> anyone else) has a better idea.

Thanks!


Zenghui