LinuxLists.cc - [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully

2021-07-08 09:29:28

Subject: [PATCH v2 0/5] Handle update hardware queues and queue freeze more carefully

Hi,

I've tested this on top of Ming's patches 'blk-mq: fix
blk_mq_alloc_request_hctx'[1] which fixes all problems (including the
hanger in nvme_wait_freeze()).

Thanks,
Danie

[1] https://lore.kernel.org/linux-nvme/[email protected]/

v1:
- https://lore.kernel.org/linux-nvme/[email protected]/
v2:
- reviewed tags collected
- added 'update hardware queues' for all transport
- added fix for fc hanger in nvme_wait_freeze_timeout

Initial cover letter:

this is a followup on the crash I reported in

https://lore.kernel.org/linux-block/[email protected]/

By moving the hardware check up the crash was gone. Unfortuntatly, I
don't understand why this fixes the crash. The per-cpu access is
crashing but I can't see why the blk_mq_update_nr_hw_queues() is
fixing this problem.

Even though I can't explain why it fixes it, I think it makes sense to
update the hardware queue mapping bevore we recreate the IO
queues. Thus I avoided in the commit message to say it fixes
something.

Also during testing I observed the we hang indivinetly in
blk_mq_freeze_queue_wait(). Again I can't explain why we get stuck
there but given a common pattern for the nvme_wait_freeze() is to use
it with a timeout I think the timeout should be used too :)

Anyway, someone with more undertanding of the stack can explain the
problems.

Daniel Wagner (4):
nvme-fc: Update hardware queues before using them
nvme-rdma: Update number of hardware queues before using them
nvme-fc: Wait with a timeout for queue to freeze
nvme-fc: Freeze queues before destroying them

Hannes Reinecke (1):
nvme-tcp: Update number of hardware queues before using them

drivers/nvme/host/fc.c | 26 +++++++++++++++++---------
drivers/nvme/host/rdma.c | 13 ++++++-------
drivers/nvme/host/tcp.c | 14 ++++++--------
3 files changed, 29 insertions(+), 24 deletions(-)

--
2.29.2

2021-07-08 09:29:29

by Daniel Wagner

[permalink] [raw]

Subject: [PATCH v2 2/5] nvme-tcp: Update number of hardware queues before using them

From: Hannes Reinecke <[email protected]>

When the number of hardware queues changes during resetting we should
update the tagset first before using it.

Signed-off-by: Hannes Reinecke <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/tcp.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 0125463b7d77..a0f15e94c7a6 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1784,6 +1784,7 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
{
int ret;
+ u32 prior_q_cnt = ctrl->queue_count;

ret = nvme_tcp_alloc_io_queues(ctrl);
if (ret)
@@ -1801,14 +1802,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
ret = PTR_ERR(ctrl->connect_q);
goto out_free_tag_set;
}
- }
-
- ret = nvme_tcp_start_io_queues(ctrl);
- if (ret)
- goto out_cleanup_connect_q;
-
- if (!new) {
- nvme_start_queues(ctrl);
+ } else if (prior_q_cnt != ctrl->queue_count) {
if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
/*
* If we timed out waiting for freeze we are likely to
@@ -1823,6 +1817,10 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
nvme_unfreeze(ctrl);
}

+ ret = nvme_tcp_start_io_queues(ctrl);
+ if (ret)
+ goto out_cleanup_connect_q;
+
return 0;

out_wait_freeze_timed_out:
--
2.29.2

2021-07-08 09:29:48

by Daniel Wagner

[permalink] [raw]

Subject: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them

nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
paired with a nvme_start_freeze(). Without freezing first we will always
timeout in nvme_wait_freeze_timeout().

Note there is a similiar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
controller reset hang during traffic") which happens to follow the PCI
strategy how to handle resetting the queues.

Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/fc.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8e1fc3796735..a38b01485939 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
nvme_fc_xmt_ls_rsp(disls);

if (ctrl->ctrl.tagset) {
+ nvme_start_freeze(&ctrl->ctrl);
nvme_fc_delete_hw_io_queues(ctrl);
nvme_fc_free_io_queues(ctrl);
}
--
2.29.2

2021-07-08 09:30:36

by Daniel Wagner

[permalink] [raw]

Subject: [PATCH v2 3/5] nvme-rdma: Update number of hardware queues before using them

When the number of hardware queues changes during resetting we should
update the tagset first before using it.

Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/rdma.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 3a296fd34bef..9825112bd9f4 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -967,6 +967,7 @@ static void nvme_rdma_destroy_io_queues(struct nvme_rdma_ctrl *ctrl,
static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
{
int ret;
+ u32 prior_q_cnt = ctrl->ctrl.queue_count;

ret = nvme_rdma_alloc_io_queues(ctrl);
if (ret)
@@ -984,13 +985,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
ret = PTR_ERR(ctrl->ctrl.connect_q);
goto out_free_tag_set;
}
- }
-
- ret = nvme_rdma_start_io_queues(ctrl);
- if (ret)
- goto out_cleanup_connect_q;
-
- if (!new) {
+ } else if (prior_q_cnt != ctrl->ctrl.queue_count) {
nvme_start_queues(&ctrl->ctrl);
if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
/*
@@ -1006,6 +1001,10 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
nvme_unfreeze(&ctrl->ctrl);
}

+ ret = nvme_rdma_start_io_queues(ctrl);
+ if (ret)
+ goto out_cleanup_connect_q;
+
return 0;

out_wait_freeze_timed_out:
--
2.29.2

2021-07-08 09:30:37

by Daniel Wagner

[permalink] [raw]

Subject: [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze

Do not wait indifinitly for all queues to freeze. Instead use a
timeout and abort the operation if we get stuck.

Reviewed-by: James Smart <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/fc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index d0eb81387d4e..8e1fc3796735 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2956,7 +2956,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
dev_info(ctrl->ctrl.device,
"reconnect: revising io queue count from %d to %d\n",
prior_ioq_cnt, nr_io_queues);
- nvme_wait_freeze(&ctrl->ctrl);
+ if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+ /*
+ * If we timed out waiting for freeze we are likely to
+ * be stuck. Fail the controller initialization just
+ * to be safe.
+ */
+ return -ENODEV;
+ }
blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
nvme_unfreeze(&ctrl->ctrl);
}
--
2.29.2

2021-07-08 09:32:17

by Daniel Wagner

[permalink] [raw]

Subject: [PATCH v2 1/5] nvme-fc: Update hardware queues before using them

In case the number of hardware queues changes, do the update the
tagset and ctx to hctx first before using the mapping to recreate and
connnect the IO queues.

Reviewed-by: James Smart <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/fc.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index c087cf7a6e1f..d0eb81387d4e 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2952,14 +2952,6 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
if (ctrl->ctrl.queue_count == 1)
return 0;

- ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
- if (ret)
- goto out_free_io_queues;
-
- ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
- if (ret)
- goto out_delete_hw_queues;
-
if (prior_ioq_cnt != nr_io_queues) {
dev_info(ctrl->ctrl.device,
"reconnect: revising io queue count from %d to %d\n",
@@ -2969,6 +2961,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
nvme_unfreeze(&ctrl->ctrl);
}

+ ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+ if (ret)
+ goto out_free_io_queues;
+
+ ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
+ if (ret)
+ goto out_delete_hw_queues;
+
return 0;

out_delete_hw_queues:
--
2.29.2

2021-07-08 10:11:09

by Hannes Reinecke

[permalink] [raw]

Subject: Re: [PATCH v2 1/5] nvme-fc: Update hardware queues before using them

On 7/8/21 11:27 AM, Daniel Wagner wrote:
> In case the number of hardware queues changes, do the update the
> tagset and ctx to hctx first before using the mapping to recreate and
> connnect the IO queues.
>
> Reviewed-by: James Smart <[email protected]>
> Reviewed-by: Ming Lei <[email protected]>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

2021-07-08 10:14:27

by Hannes Reinecke

[permalink] [raw]

Subject: Re: [PATCH v2 4/5] nvme-fc: Wait with a timeout for queue to freeze

On 7/8/21 11:27 AM, Daniel Wagner wrote:
> Do not wait indifinitly for all queues to freeze. Instead use a
> timeout and abort the operation if we get stuck.
>
> Reviewed-by: James Smart <[email protected]>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index d0eb81387d4e..8e1fc3796735 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2956,7 +2956,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
> dev_info(ctrl->ctrl.device,
> "reconnect: revising io queue count from %d to %d\n",
> prior_ioq_cnt, nr_io_queues);
> - nvme_wait_freeze(&ctrl->ctrl);
> + if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> + /*
> + * If we timed out waiting for freeze we are likely to
> + * be stuck. Fail the controller initialization just
> + * to be safe.
> + */
> + return -ENODEV;

For controller reset we're using '-ENOTCONN'; maybe it's worthwhile to
use the same error code here.
But that's just a minor detail.

> + }
> blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> nvme_unfreeze(&ctrl->ctrl);
> }
>
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

2021-07-08 10:15:10

by Hannes Reinecke

[permalink] [raw]

Subject: Re: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them

On 7/8/21 11:27 AM, Daniel Wagner wrote:
> nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
> paired with a nvme_start_freeze(). Without freezing first we will always
> timeout in nvme_wait_freeze_timeout().
>
> Note there is a similiar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
> controller reset hang during traffic") which happens to follow the PCI
> strategy how to handle resetting the queues.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 8e1fc3796735..a38b01485939 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
> nvme_fc_xmt_ls_rsp(disls);
>
> if (ctrl->ctrl.tagset) {
> + nvme_start_freeze(&ctrl->ctrl);
> nvme_fc_delete_hw_io_queues(ctrl);
> nvme_fc_free_io_queues(ctrl);
> }
>
Please add a comment here about the pairing. We've missed it once, so we
should make it clear why it has to be placed here.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

2021-07-09 16:14:58

by James Smart

[permalink] [raw]

Subject: Re: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them

On 7/8/2021 2:27 AM, Daniel Wagner wrote:
> nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
> paired with a nvme_start_freeze(). Without freezing first we will always
> timeout in nvme_wait_freeze_timeout().
>
> Note there is a similiar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
> controller reset hang during traffic") which happens to follow the PCI
> strategy how to handle resetting the queues.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 8e1fc3796735..a38b01485939 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
> nvme_fc_xmt_ls_rsp(disls);
>
> if (ctrl->ctrl.tagset) {
> + nvme_start_freeze(&ctrl->ctrl);
> nvme_fc_delete_hw_io_queues(ctrl);
> nvme_fc_free_io_queues(ctrl);
> }
>

Thanks for the note. that definitely helped follow what is being
attempted. I also agree with Hannes that the comment from the rdma patch
should also be present to understand what's going on.

Looking at the patch - this is not done in the same place or manner as
rdma. Freezing and stoppage is prior to cancelling and that doesn't
correspond where this was added (this is after all cancellations). We
also seem to be missing a nvme_sync_io_queues() call in the sequence as
well. So I believe there's more work to be done on this patch. I'll see
what I can do.

We really need to see about a common layer for transports. So much we do
is similar. We were ok at the start, but we've drifted apart over time
and the requirements to the core layer aren't propogating to all transports.

-- james

2021-07-09 16:44:05

by Daniel Wagner

[permalink] [raw]

Subject: Re: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them

Hi James,

On Fri, Jul 09, 2021 at 09:14:07AM -0700, James Smart wrote:
> Thanks for the note. that definitely helped follow what is being attempted.
> I also agree with Hannes that the comment from the rdma patch should also be
> present to understand what's going on.

Sure will do. Though this has to wait until I am back from holiday
though :)

> Looking at the patch - this is not done in the same place or manner as rdma.
> Freezing and stoppage is prior to cancelling and that doesn't correspond
> where this was added (this is after all cancellations). We also seem to be
> missing a nvme_sync_io_queues() call in the sequence as well. So I believe
> there's more work to be done on this patch. I'll see what I can do.

Thanks!

> We really need to see about a common layer for transports. So much we do is
> similar. We were ok at the start, but we've drifted apart over time and the
> requirements to the core layer aren't propogating to all transports.

Yes, makes a lot of sense to me to sync the transports implementation a
bit more.

Thanks,
Daniel