2021-06-09 18:50:34

by Heiner Kallweit

Subject: linux-next: NVME using PCI legacy interrupts only

I found that on linux-next from June 8th my nvme disk is using legacy
interrupts only. Some debugging led me to irq_find_mapping() in
msi_domain_alloc() returning -EEXIST.

The nvme core first allocates an MSI-X interrupt for setup purposes
and later frees it and allocates the final number of MSI-X interrupts.

The following experimental change brought back the MSI-X interrupts.
This makes me think that somehow freeing an MSI-X interrupt doesn't
free it completely. I didn't see this behavior a few days ago,
therefore I think it's related to the recent changes to
irqdomain/genirq.

Didn't do a bisect yet, maybe you have an idea already.

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index a29b17070..8cc600819 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2381,7 +2381,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
 	 * interrupts. Pre-enable a single MSIX or MSI vec for setup. We'll
 	 * adjust this later.
 	 */
-	result = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+	result = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_LEGACY);
 	if (result < 0)
 		return result;



2021-06-10 08:08:51

by Marc Zyngier

Subject: Re: linux-next: NVME using PCI legacy interrupts only

On Wed, 09 Jun 2021 19:43:57 +0100,
Heiner Kallweit <[email protected]> wrote:
>
> I found that on linux-next from June 8th my nvme disk is using legacy
> interrupts only. Some debugging led me to irq_find_mapping() in
> msi_domain_alloc() returning -EEXIST.
>
> The nvme core first allocates an MSI-X interrupt for setup purposes
> and later frees it and allocates the final number of MSI-X interrupts.
>
> The following experimental change brought back the MSI-X interrupts.
> This makes me think that somehow freeing an MSI-X interrupt doesn't
> free it completely. I didn't see this behavior a few days ago,
> therefore I think it's related to the recent changes to
> irqdomain/genirq.
>
> Didn't do a bisect yet, maybe you have an idea already.

Yeah, recent changes in the irqdomain subsystem seem to have uncovered
a long-standing issue where we are leaving dangling references in some
domains...

I've now dropped the branch from -next while I figure it out.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.