2024-04-22 16:29:47

by Sean Anderson

[permalink] [raw]
Subject: [PATCH] nvme-pci: Add quirk for broken MSIs

Sandisk SN530 NVMe drives have broken MSIs. On systems without MSI-X
support, all commands time out resulting in the following message:

nvme nvme0: I/O tag 12 (100c) QID 0 timeout, completion polled

These timeouts cause the boot to take an excessively-long time (over 20
minutes) while the initial command queue is flushed.

Address this by adding a quirk for drives with buggy MSIs. The lspci
output for this device (recorded on a system with MSI-X support) is:

02:00.0 Non-Volatile memory controller: Sandisk Corp Device 5008 (rev 01) (prog-if 02 [NVM Express])
Subsystem: Sandisk Corp Device 5008
Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
Memory at f7e00000 (64-bit, non-prefetchable) [size=16K]
Memory at f7e04000 (64-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 3
Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
Capabilities: [b0] MSI-X: Enable+ Count=17 Masked-
Capabilities: [c0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [150] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [1b8] Latency Tolerance Reporting
Capabilities: [300] Secondary PCI Express
Capabilities: [900] L1 PM Substates
Kernel driver in use: nvme
Kernel modules: nvme

Cc: <[email protected]>
Signed-off-by: Sean Anderson <[email protected]>
---

drivers/nvme/host/nvme.h | 5 +++++
drivers/nvme/host/pci.c | 14 +++++++++++---
2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 7b87763e2f8a..738719085e05 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -162,6 +162,11 @@ enum nvme_quirks {
* Disables simple suspend/resume path.
*/
NVME_QUIRK_FORCE_NO_SIMPLE_SUSPEND = (1 << 20),
+
+ /*
+ * MSI (but not MSI-X) interrupts are broken and never fire.
+ */
+ NVME_QUIRK_BROKEN_MSI = (1 << 21),
};

/*
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e6267a6aa380..d1f0fa2e7c96 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2218,6 +2218,7 @@ static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
.priv = dev,
};
unsigned int irq_queues, poll_queues;
+ unsigned int flags = PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY;

/*
* Poll queues don't need interrupts, but we need at least one I/O queue
@@ -2241,8 +2242,10 @@ static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
irq_queues = 1;
if (!(dev->ctrl.quirks & NVME_QUIRK_SINGLE_VECTOR))
irq_queues += (nr_io_queues - poll_queues);
- return pci_alloc_irq_vectors_affinity(pdev, 1, irq_queues,
- PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
+ if (dev->ctrl.quirks & NVME_QUIRK_BROKEN_MSI)
+ flags &= ~PCI_IRQ_MSI;
+ return pci_alloc_irq_vectors_affinity(pdev, 1, irq_queues, flags,
+ &affd);
}

static unsigned int nvme_max_io_queues(struct nvme_dev *dev)
@@ -2471,6 +2474,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
{
int result = -ENOMEM;
struct pci_dev *pdev = to_pci_dev(dev->dev);
+ unsigned int flags = PCI_IRQ_ALL_TYPES;

if (pci_enable_device_mem(pdev))
return result;
@@ -2487,7 +2491,9 @@ static int nvme_pci_enable(struct nvme_dev *dev)
* interrupts. Pre-enable a single MSIX or MSI vec for setup. We'll
* adjust this later.
*/
- result = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (dev->ctrl.quirks & NVME_QUIRK_BROKEN_MSI)
+ flags &= ~PCI_IRQ_MSI;
+ result = pci_alloc_irq_vectors(pdev, 1, 1, flags);
if (result < 0)
goto disable;

@@ -3381,6 +3387,8 @@ static const struct pci_device_id nvme_id_table[] = {
.driver_data = NVME_QUIRK_DELAY_BEFORE_CHK_RDY |
NVME_QUIRK_DISABLE_WRITE_ZEROES|
NVME_QUIRK_IGNORE_DEV_SUBNQN, },
+ { PCI_DEVICE(0x15b7, 0x5008), /* Sandisk SN530 */
+ .driver_data = NVME_QUIRK_BROKEN_MSI },
{ PCI_DEVICE(0x1987, 0x5012), /* Phison E12 */
.driver_data = NVME_QUIRK_BOGUS_NID, },
{ PCI_DEVICE(0x1987, 0x5016), /* Phison E16 */
--
2.35.1.1320.gc452695387.dirty



2024-04-22 16:54:12

by Keith Busch

[permalink] [raw]
Subject: Re: [PATCH] nvme-pci: Add quirk for broken MSIs

On Mon, Apr 22, 2024 at 12:28:23PM -0400, Sean Anderson wrote:
> Sandisk SN530 NVMe drives have broken MSIs. On systems without MSI-X
> support, all commands time out resulting in the following message:
>
> nvme nvme0: I/O tag 12 (100c) QID 0 timeout, completion polled
>
> These timeouts cause the boot to take an excessively-long time (over 20
> minutes) while the initial command queue is flushed.
>
> Address this by adding a quirk for drives with buggy MSIs. The lspci
> output for this device (recorded on a system with MSI-X support) is:

Based on your description, the patch looks good. This will fallback to
legacy emulated pin interrupts, and that's better than timeout polling,
but will still appear sluggish compared to MSI's. Is there an errata
from the vendor on this? I'm just curious if the bug is at the Device ID
level, and not something we could constrain to a particular model or
firmware revision.

> 02:00.0 Non-Volatile memory controller: Sandisk Corp Device 5008 (rev 01) (prog-if 02 [NVM Express])
> Subsystem: Sandisk Corp Device 5008
> Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
> Memory at f7e00000 (64-bit, non-prefetchable) [size=16K]
> Memory at f7e04000 (64-bit, non-prefetchable) [size=256]
> Capabilities: [80] Power Management version 3
> Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
> Capabilities: [b0] MSI-X: Enable+ Count=17 Masked-

Interesting, the MSI capability does look weird here. I've never seen
MSI-x count smaller than the MSI's. As long as both work, though, I
think nvme would actually prefer whichever is bigger!

2024-04-22 17:15:53

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH] nvme-pci: Add quirk for broken MSIs

On 4/22/24 12:49, Keith Busch wrote:
> On Mon, Apr 22, 2024 at 12:28:23PM -0400, Sean Anderson wrote:
>> Sandisk SN530 NVMe drives have broken MSIs. On systems without MSI-X
>> support, all commands time out resulting in the following message:
>>
>> nvme nvme0: I/O tag 12 (100c) QID 0 timeout, completion polled
>>
>> These timeouts cause the boot to take an excessively-long time (over 20
>> minutes) while the initial command queue is flushed.
>>
>> Address this by adding a quirk for drives with buggy MSIs. The lspci
>> output for this device (recorded on a system with MSI-X support) is:
>
> Based on your description, the patch looks good. This will fallback to
> legacy emulated pin interrupts, and that's better than timeout polling,
> but will still appear sluggish compared to MSI's. Is there an errata
> from the vendor on this? I'm just curious if the bug is at the Device ID
> level, and not something we could constrain to a particular model or
> firmware revision.

I wasn't able to find any errata for this drive. I wasn't able to
determine if there are any firmware updates for this drive (FWIW I have
version "21160001"). I'll contact WD and see if they know about this
issue.

[1] https://www.westerndigital.com/products/internal-drives/pc-sn530-ssd

>> 02:00.0 Non-Volatile memory controller: Sandisk Corp Device 5008 (rev 01) (prog-if 02 [NVM Express])
>> Subsystem: Sandisk Corp Device 5008
>> Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
>> Memory at f7e00000 (64-bit, non-prefetchable) [size=16K]
>> Memory at f7e04000 (64-bit, non-prefetchable) [size=256]
>> Capabilities: [80] Power Management version 3
>> Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
>> Capabilities: [b0] MSI-X: Enable+ Count=17 Masked-
>
> Interesting, the MSI capability does look weird here. I've never seen
> MSI-x count smaller than the MSI's. As long as both work, though, I
> think nvme would actually prefer whichever is bigger!

--Sean

2024-05-03 19:32:57

by Sean Anderson

[permalink] [raw]
Subject: Re: [PATCH] nvme-pci: Add quirk for broken MSIs

On 4/22/24 13:15, Sean Anderson wrote:
> On 4/22/24 12:49, Keith Busch wrote:
>> On Mon, Apr 22, 2024 at 12:28:23PM -0400, Sean Anderson wrote:
>>> Sandisk SN530 NVMe drives have broken MSIs. On systems without MSI-X
>>> support, all commands time out resulting in the following message:
>>>
>>> nvme nvme0: I/O tag 12 (100c) QID 0 timeout, completion polled
>>>
>>> These timeouts cause the boot to take an excessively-long time (over 20
>>> minutes) while the initial command queue is flushed.
>>>
>>> Address this by adding a quirk for drives with buggy MSIs. The lspci
>>> output for this device (recorded on a system with MSI-X support) is:
>>
>> Based on your description, the patch looks good. This will fallback to
>> legacy emulated pin interrupts, and that's better than timeout polling,
>> but will still appear sluggish compared to MSI's. Is there an errata
>> from the vendor on this? I'm just curious if the bug is at the Device ID
>> level, and not something we could constrain to a particular model or
>> firmware revision.
>
> I wasn't able to find any errata for this drive. I wasn't able to
> determine if there are any firmware updates for this drive (FWIW I have
> version "21160001"). I'll contact WD and see if they know about this
> issue.

Well, the response from WD support was "we don't support Linux, and if
we did there aren't any bugs in the drive anyway".

--Sean

2024-05-07 13:57:25

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] nvme-pci: Add quirk for broken MSIs

Looks good:

Reviewed-by: Christoph Hellwig <[email protected]>

2024-05-07 15:03:11

by Keith Busch

[permalink] [raw]
Subject: Re: [PATCH] nvme-pci: Add quirk for broken MSIs

On Mon, Apr 22, 2024 at 12:28:23PM -0400, Sean Anderson wrote:
> Sandisk SN530 NVMe drives have broken MSIs. On systems without MSI-X
> support, all commands time out resulting in the following message:
>
> nvme nvme0: I/O tag 12 (100c) QID 0 timeout, completion polled
>
> These timeouts cause the boot to take an excessively-long time (over 20
> minutes) while the initial command queue is flushed.
>
> Address this by adding a quirk for drives with buggy MSIs. The lspci
> output for this device (recorded on a system with MSI-X support) is:
>
> 02:00.0 Non-Volatile memory controller: Sandisk Corp Device 5008 (rev 01) (prog-if 02 [NVM Express])
> Subsystem: Sandisk Corp Device 5008
> Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
> Memory at f7e00000 (64-bit, non-prefetchable) [size=16K]
> Memory at f7e04000 (64-bit, non-prefetchable) [size=256]
> Capabilities: [80] Power Management version 3
> Capabilities: [90] MSI: Enable- Count=1/32 Maskable- 64bit+
> Capabilities: [b0] MSI-X: Enable+ Count=17 Masked-
> Capabilities: [c0] Express Endpoint, MSI 00
> Capabilities: [100] Advanced Error Reporting
> Capabilities: [150] Device Serial Number 00-00-00-00-00-00-00-00
> Capabilities: [1b8] Latency Tolerance Reporting
> Capabilities: [300] Secondary PCI Express
> Capabilities: [900] L1 PM Substates
> Kernel driver in use: nvme
> Kernel modules: nvme
>
> Cc: <[email protected]>
> Signed-off-by: Sean Anderson <[email protected]>

Thanks, applied to nvme-6.9.