2021-08-02 11:28:47

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH RESEND v4 0/8] Handle update hardware queues and queue freeze more carefully

[forgot to add the linux-nvme mailing list]

Hi,

This update version makes sure the unfreeze call is done when
recreating the queues. I was able to reproduce hanging I/Os when we go
into error recovery mode for FC and TCP [1]. Unfortunatly, I don't
have access to a RDMA setup to verify but as the code is identically
to the TCP, RDMA looks to like to suffer from the same problem.

Thanks,
Daniel

[1] https://lore.kernel.org/linux-nvme/[email protected]/
https://lore.kernel.org/linux-nvme/[email protected]/


v1:
- https://lore.kernel.org/linux-nvme/[email protected]/
v2:
- https://lore.kernel.org/linux-nvme/[email protected]/
- reviewed tags collected
- added 'update hardware queues' for all transport
- added fix for fc hanger in nvme_wait_freeze_timeout
v3:
- https://lore.kernel.org/linux-nvme/[email protected]/
- dropped 'nvme-fc: Freeze queues before destroying them'
- added James' two patches
v4:
- added 'nvme-*: Unfreeze queues on reconnect'
- added Hannes' reviewed tags


Daniel Wagner (5):
nvme-fc: Update hardware queues before using them
nvme-rdma: Update number of hardware queues before using them
nvme-fc: Wait with a timeout for queue to freeze
nvme-tcp: Unfreeze queues on reconnect
nvme-rdma: Unfreeze queues on reconnect

Hannes Reinecke (1):
nvme-tcp: Update number of hardware queues before using them

James Smart (2):
nvme-fc: avoid race between time out and tear down
nvme-fc: fix controller reset hang during traffic

drivers/nvme/host/fc.c | 28 +++++++++++++++++++---------
drivers/nvme/host/rdma.c | 15 ++++++++-------
drivers/nvme/host/tcp.c | 18 +++++++++---------
3 files changed, 36 insertions(+), 25 deletions(-)

--
2.29.2



2021-08-02 11:28:48

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH v4 3/8] nvme-rdma: Update number of hardware queues before using them

When the number of hardware queues changes during resetting we should
update the tagset first before using it.

Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/rdma.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 69ae67652f38..de2a8950d282 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -965,6 +965,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
{
int ret;
+ u32 prior_q_cnt = ctrl->ctrl.queue_count;

ret = nvme_rdma_alloc_io_queues(ctrl);
if (ret)
@@ -982,13 +983,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
ret = PTR_ERR(ctrl->ctrl.connect_q);
goto out_free_tag_set;
}
- }
-
- ret = nvme_rdma_start_io_queues(ctrl);
- if (ret)
- goto out_cleanup_connect_q;
-
- if (!new) {
+ } else if (prior_q_cnt != ctrl->ctrl.queue_count) {
nvme_start_queues(&ctrl->ctrl);
if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
/*
@@ -1004,6 +999,10 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
nvme_unfreeze(&ctrl->ctrl);
}

+ ret = nvme_rdma_start_io_queues(ctrl);
+ if (ret)
+ goto out_cleanup_connect_q;
+
return 0;

out_wait_freeze_timed_out:
--
2.29.2


2021-08-02 11:28:57

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH v4 5/8] nvme-fc: avoid race between time out and tear down

From: James Smart <[email protected]>

To avoid race between time out and tear down, in tear down process,
first we quiesce the queue, and then delete the timer and cancel
the time out work for the queue.

This patch merges the admin and io sync ops into the queue teardown logic
as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time
out and tear down". There is no teardown_lock in nvme-fc.

Signed-off-by: James Smart <[email protected]>
CC: Chao Leng <[email protected]>
Tested-by: Daniel Wagner <[email protected]>
[dwagner: updated commit id referenced in commit message]
Reviewed-by: Daniel Wagner <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
---
drivers/nvme/host/fc.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index dbb8ad816df8..133b87db4f1d 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2487,6 +2487,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
*/
if (ctrl->ctrl.queue_count > 1) {
nvme_stop_queues(&ctrl->ctrl);
+ nvme_sync_io_queues(&ctrl->ctrl);
blk_mq_tagset_busy_iter(&ctrl->tag_set,
nvme_fc_terminate_exchange, &ctrl->ctrl);
blk_mq_tagset_wait_completed_request(&ctrl->tag_set);
@@ -2510,6 +2511,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
* clean up the admin queue. Same thing as above.
*/
blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+ blk_sync_queue(ctrl->ctrl.admin_q);
blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
nvme_fc_terminate_exchange, &ctrl->ctrl);
blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
--
2.29.2


2021-08-02 11:29:10

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect

During the queue teardown in nvme_rdma_teardown_io_queues() freeze is
called unconditionally. When we reconnect we need to pair the freeze
with an unfreeze to avoid hanging I/Os. For newly created connection
this is not needed.

Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/rdma.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index de2a8950d282..21a8a5353af0 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
error = PTR_ERR(ctrl->ctrl.admin_q);
goto out_cleanup_fabrics_q;
}
+ } else {
+ nvme_unfreeze(&ctrl->ctrl);
}

error = nvme_rdma_start_queue(ctrl, 0);
--
2.29.2


2021-08-02 11:30:10

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH v4 4/8] nvme-fc: Wait with a timeout for queue to freeze

Do not wait indifinitly for all queues to freeze. Instead use a
timeout and abort the operation if we get stuck.

Reviewed-by: James Smart <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/fc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8a903769364f..dbb8ad816df8 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
dev_info(ctrl->ctrl.device,
"reconnect: revising io queue count from %d to %d\n",
prior_ioq_cnt, nr_io_queues);
- nvme_wait_freeze(&ctrl->ctrl);
+ if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+ /*
+ * If we timed out waiting for freeze we are likely to
+ * be stuck. Fail the controller initialization just
+ * to be safe.
+ */
+ return -ENODEV;
+ }
blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
nvme_unfreeze(&ctrl->ctrl);
}
--
2.29.2


2021-08-02 11:31:49

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH v4 7/8] nvme-tcp: Unfreeze queues on reconnect

During the queue teardown in nvme_tcp_teardown_io_queues() freeze is
called unconditionally. When we reconnect we need to pair the freeze
with an unfreeze to avoid hanging I/Os. For newly created connection
this is not needed.

Fixes: 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic")
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/tcp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 32268f24f62a..097f7dd10ed3 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1819,9 +1819,11 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
}
blk_mq_update_nr_hw_queues(ctrl->tagset,
ctrl->queue_count - 1);
- nvme_unfreeze(ctrl);
}

+ if (!new)
+ nvme_unfreeze(ctrl);
+
ret = nvme_tcp_start_io_queues(ctrl);
if (ret)
goto out_cleanup_connect_q;
--
2.29.2


2021-08-02 19:38:00

by Himanshu Madhani

[permalink] [raw]
Subject: Re: [PATCH v4 4/8] nvme-fc: Wait with a timeout for queue to freeze



On 8/2/21 6:26 AM, Daniel Wagner wrote:
> Do not wait indifinitly for all queues to freeze. Instead use a
> timeout and abort the operation if we get stuck.
>
> Reviewed-by: James Smart <[email protected]>
> Reviewed-by: Hannes Reinecke <[email protected]>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 8a903769364f..dbb8ad816df8 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
> dev_info(ctrl->ctrl.device,
> "reconnect: revising io queue count from %d to %d\n",
> prior_ioq_cnt, nr_io_queues);
> - nvme_wait_freeze(&ctrl->ctrl);
> + if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> + /*
> + * If we timed out waiting for freeze we are likely to
> + * be stuck. Fail the controller initialization just
> + * to be safe.
> + */
> + return -ENODEV;
> + }
> blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> nvme_unfreeze(&ctrl->ctrl);
> }
>

Reviewed-by: Himanshu Madhani <[email protected]>

--
Himanshu Madhani Oracle Linux Engineering

2021-08-02 19:40:17

by Himanshu Madhani

[permalink] [raw]
Subject: Re: [PATCH v4 5/8] nvme-fc: avoid race between time out and tear down



On 8/2/21 6:26 AM, Daniel Wagner wrote:
> From: James Smart <[email protected]>
>
> To avoid race between time out and tear down, in tear down process,
> first we quiesce the queue, and then delete the timer and cancel
> the time out work for the queue.
>
> This patch merges the admin and io sync ops into the queue teardown logic
> as shown in the RDMA patch 3017013dcc "nvme-rdma: avoid race between time
> out and tear down". There is no teardown_lock in nvme-fc.
>
> Signed-off-by: James Smart <[email protected]>
> CC: Chao Leng <[email protected]>
> Tested-by: Daniel Wagner <[email protected]>
> [dwagner: updated commit id referenced in commit message]
> Reviewed-by: Daniel Wagner <[email protected]>
> Reviewed-by: Hannes Reinecke <[email protected]>
> ---
> drivers/nvme/host/fc.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index dbb8ad816df8..133b87db4f1d 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2487,6 +2487,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
> */
> if (ctrl->ctrl.queue_count > 1) {
> nvme_stop_queues(&ctrl->ctrl);
> + nvme_sync_io_queues(&ctrl->ctrl);
> blk_mq_tagset_busy_iter(&ctrl->tag_set,
> nvme_fc_terminate_exchange, &ctrl->ctrl);
> blk_mq_tagset_wait_completed_request(&ctrl->tag_set);
> @@ -2510,6 +2511,7 @@ __nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues)
> * clean up the admin queue. Same thing as above.
> */
> blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> + blk_sync_queue(ctrl->ctrl.admin_q);
> blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
> nvme_fc_terminate_exchange, &ctrl->ctrl);
> blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set);
>

Reviewed-by: Himanshu Madhani <[email protected]>

--
Himanshu Madhani Oracle Linux Engineering

2021-08-04 07:29:38

by Hannes Reinecke

[permalink] [raw]
Subject: Re: [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect

On 8/2/21 1:26 PM, Daniel Wagner wrote:
> During the queue teardown in nvme_rdma_teardown_io_queues() freeze is
> called unconditionally. When we reconnect we need to pair the freeze
> with an unfreeze to avoid hanging I/Os. For newly created connection
> this is not needed.
>
> Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/rdma.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index de2a8950d282..21a8a5353af0 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
> error = PTR_ERR(ctrl->ctrl.admin_q);
> goto out_cleanup_fabrics_q;
> }
> + } else {
> + nvme_unfreeze(&ctrl->ctrl);
> }
>
> error = nvme_rdma_start_queue(ctrl, 0);
>
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer

2021-08-07 00:10:19

by Sagi Grimberg

[permalink] [raw]
Subject: Re: [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect


> During the queue teardown in nvme_rdma_teardown_io_queues() freeze is
> called unconditionally. When we reconnect we need to pair the freeze
> with an unfreeze to avoid hanging I/Os. For newly created connection
> this is not needed.
>
> Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/rdma.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index de2a8950d282..21a8a5353af0 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
> error = PTR_ERR(ctrl->ctrl.admin_q);
> goto out_cleanup_fabrics_q;
> }
> + } else {
> + nvme_unfreeze(&ctrl->ctrl);

That seems misplaced.. unfreezing the I/O queues when setting up the
admin queue?

2021-08-09 09:00:05

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH v4 8/8] nvme-rdma: Unfreeze queues on reconnect

Hi Sagi,

On Fri, Aug 06, 2021 at 12:59:15PM -0700, Sagi Grimberg wrote:
>
> > During the queue teardown in nvme_rdma_teardown_io_queues() freeze is
> > called unconditionally. When we reconnect we need to pair the freeze
> > with an unfreeze to avoid hanging I/Os. For newly created connection
> > this is not needed.
> >
> > Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
> > Signed-off-by: Daniel Wagner <[email protected]>
> > ---
> > drivers/nvme/host/rdma.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> > index de2a8950d282..21a8a5353af0 100644
> > --- a/drivers/nvme/host/rdma.c
> > +++ b/drivers/nvme/host/rdma.c
> > @@ -901,6 +901,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
> > error = PTR_ERR(ctrl->ctrl.admin_q);
> > goto out_cleanup_fabrics_q;
> > }
> > + } else {
> > + nvme_unfreeze(&ctrl->ctrl);
>
> That seems misplaced.. unfreezing the I/O queues when setting up the admin
> queue?

Indeed. After looking again on it, this should be almost identically to
the tcp.c fix in nvme_rdma_configure_io_queues.

BTW, I am working on getting a RDMA test setup running. Hopefully I have
all the right licenses on the array.

Daniel