2024-01-10 18:29:10

by William Butler

Subject: [PATCH] nvme: Inform controller of doorbell config before unquiescing IO queues

During resets, if queues are unquiesced first, then the host can submit
IOs to the controller using shadow doorbell logic but the controller
won't be aware. This can lead to necessary MMIO doorbells not being
issued, causing requests to be delayed and to time out.

Signed-off-by: William Butler <[email protected]>
---
 drivers/nvme/host/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 61af7ff1a9d6..f87c51a946b3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2743,10 +2743,10 @@ static void nvme_reset_work(struct work_struct *work)
 	 * controller around but remove all namespaces.
 	 */
 	if (dev->online_queues > 1) {
+		nvme_dbbuf_set(dev);
 		nvme_unquiesce_io_queues(&dev->ctrl);
 		nvme_wait_freeze(&dev->ctrl);
 		nvme_pci_update_nr_queues(dev);
-		nvme_dbbuf_set(dev);
 		nvme_unfreeze(&dev->ctrl);
 	} else {
 		dev_warn(dev->ctrl.device, "IO queues lost\n");
--
2.43.0.275.g3460e3d667-goog



2024-01-10 18:40:38

by Keith Busch

Subject: Re: [PATCH] nvme: Inform controller of doorbell config before unquiescing IO queues

On Wed, Jan 10, 2024 at 06:28:55PM +0000, William Butler wrote:
> During resets, if queues are unquiesced first, then the host can submit
> IOs to the controller using shadow doorbell logic but the controller
> won't be aware. This can lead to necessary MMIO doorbells not being
> issued, causing requests to be delayed and to time out.

Good catch. I guess we wouldn't normally expect any requests to be
queued at this point, but there are plenty of reasons it could happen.

Applied to nvme-6.8 with a slightly updated title to match the local
conventions.
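
For readers unfamiliar with shadow doorbells, the race described in the
patch can be sketched with a small userspace model. This is only an
illustration under stated assumptions, not the driver code: struct
toy_queue, host_submit() and ctrl_visible_tail() are hypothetical names,
and need_event() is modeled on the wrap-safe comparison the driver uses to
decide whether an MMIO doorbell write can be skipped. The model assumes the
shadow buffers start out zeroed after a reset.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wrap-safe check: true when the controller's event index falls in the
 * window (old_idx, new_idx], i.e. the controller asked to be notified.
 * Modeled on the driver's shadow doorbell comparison (assumption).
 */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Hypothetical toy model of one submission queue. */
struct toy_queue {
	uint32_t shadow_db;      /* tail value the host writes to memory */
	uint32_t event_idx;      /* index the controller would update */
	uint32_t mmio_db;        /* last value written to the real doorbell */
	bool ctrl_polls_shadow;  /* set once the controller knows the buffers */
};

/*
 * Host side: publish the new tail in the shadow doorbell and write the
 * MMIO doorbell only if the event index asks for it.
 * Returns true when the MMIO doorbell was written.
 */
static bool host_submit(struct toy_queue *q, uint16_t new_tail)
{
	uint16_t old_tail = (uint16_t)q->shadow_db;

	q->shadow_db = new_tail;
	if (!need_event((uint16_t)q->event_idx, new_tail, old_tail))
		return false;   /* host assumes the controller polls the shadow */

	q->mmio_db = new_tail;
	return true;
}

/* Controller side: the tail the controller can actually observe. */
static uint16_t ctrl_visible_tail(const struct toy_queue *q)
{
	return (uint16_t)(q->ctrl_polls_shadow ? q->shadow_db : q->mmio_db);
}
```

In this model, starting from a zeroed queue, host_submit(&q, 1) writes the
MMIO doorbell but host_submit(&q, 2) does not, because the event index has
not moved. While ctrl_polls_shadow is false (the doorbell buffer config has
not been sent yet), ctrl_visible_tail() still reports 1, so the second
request sits unnoticed. Setting the flag first, which corresponds to calling
nvme_dbbuf_set() before unquiescing, makes the controller see tail 2.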