2021-06-25 10:18:27

by Daniel Wagner

[permalink] [raw]
Subject: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

Do not wait indifinitly for all queues to freeze. Instead use a
timeout and abort the operation if we get stuck.

Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/fc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index a9645cd89eca..d8db85aa5417 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
dev_info(ctrl->ctrl.device,
"reconnect: revising io queue count from %d to %d\n",
prior_ioq_cnt, nr_io_queues);
- nvme_wait_freeze(&ctrl->ctrl);
+ if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
+ /*
+ * If we timed out waiting for freeze we are likely to
+ * be stuck. Fail the controller initialization just
+ * to be safe.
+ */
+ return -ENODEV;
+ }
blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
nvme_unfreeze(&ctrl->ctrl);
}
--
2.29.2


2021-06-27 14:11:48

by James Smart

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On 6/25/2021 3:16 AM, Daniel Wagner wrote:
> Do not wait indifinitly for all queues to freeze. Instead use a
> timeout and abort the operation if we get stuck.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index a9645cd89eca..d8db85aa5417 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
> dev_info(ctrl->ctrl.device,
> "reconnect: revising io queue count from %d to %d\n",
> prior_ioq_cnt, nr_io_queues);
> - nvme_wait_freeze(&ctrl->ctrl);
> + if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> + /*
> + * If we timed out waiting for freeze we are likely to
> + * be stuck. Fail the controller initialization just
> + * to be safe.
> + */
> + return -ENODEV;
> + }
> blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> nvme_unfreeze(&ctrl->ctrl);
> }
>

Looks fine. This is one of those that changed in the other transports
but fc wasn't part of the patch set.

Reviewed-by: James Smart <[email protected]>

-- james

2021-06-29 01:40:35

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Fri, Jun 25, 2021 at 12:16:49PM +0200, Daniel Wagner wrote:
> Do not wait indifinitly for all queues to freeze. Instead use a
> timeout and abort the operation if we get stuck.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index a9645cd89eca..d8db85aa5417 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
> dev_info(ctrl->ctrl.device,
> "reconnect: revising io queue count from %d to %d\n",
> prior_ioq_cnt, nr_io_queues);
> - nvme_wait_freeze(&ctrl->ctrl);
> + if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> + /*
> + * If we timed out waiting for freeze we are likely to
> + * be stuck. Fail the controller initialization just
> + * to be safe.
> + */
> + return -ENODEV;
> + }
> blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> nvme_unfreeze(&ctrl->ctrl);

Can you investigate a bit on why there is the hang? FC shouldn't use
managed IRQ, so the interrupt won't be shutdown.

blk-mq debugfs may help to dump the requests after the hang is triggered,
or you still can add debug code in nvme_wait_freeze_timeout() to dump
all requests if blk-mq debugfs doesn't work.


Thanks,
Ming

2021-06-29 07:49:54

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Tue, Jun 29, 2021 at 09:39:30AM +0800, Ming Lei wrote:
> On Fri, Jun 25, 2021 at 12:16:49PM +0200, Daniel Wagner wrote:
> > Do not wait indifinitly for all queues to freeze. Instead use a
> > timeout and abort the operation if we get stuck.
> >
> > Signed-off-by: Daniel Wagner <[email protected]>
> > ---
> > drivers/nvme/host/fc.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> > index a9645cd89eca..d8db85aa5417 100644
> > --- a/drivers/nvme/host/fc.c
> > +++ b/drivers/nvme/host/fc.c
> > @@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
> > dev_info(ctrl->ctrl.device,
> > "reconnect: revising io queue count from %d to %d\n",
> > prior_ioq_cnt, nr_io_queues);
> > - nvme_wait_freeze(&ctrl->ctrl);
> > + if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> > + /*
> > + * If we timed out waiting for freeze we are likely to
> > + * be stuck. Fail the controller initialization just
> > + * to be safe.
> > + */
> > + return -ENODEV;
> > + }
> > blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> > nvme_unfreeze(&ctrl->ctrl);
>
> Can you investigate a bit on why there is the hang? FC shouldn't use
> managed IRQ, so the interrupt won't be shutdown.
>
> blk-mq debugfs may help to dump the requests after the hang is triggered,
> or you still can add debug code in nvme_wait_freeze_timeout() to dump
> all requests if blk-mq debugfs doesn't work.

Sure thing, I'll try to find out why it hangs. The good thing is that I
was able to reliable reproduce it. So let's see.

2021-06-29 12:36:47

by Hannes Reinecke

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On 6/25/21 12:16 PM, Daniel Wagner wrote:
> Do not wait indifinitly for all queues to freeze. Instead use a
> timeout and abort the operation if we get stuck.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/fc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index a9645cd89eca..d8db85aa5417 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2955,7 +2955,14 @@ nvme_fc_recreate_io_queues(struct nvme_fc_ctrl *ctrl)
> dev_info(ctrl->ctrl.device,
> "reconnect: revising io queue count from %d to %d\n",
> prior_ioq_cnt, nr_io_queues);
> - nvme_wait_freeze(&ctrl->ctrl);
> + if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
> + /*
> + * If we timed out waiting for freeze we are likely to
> + * be stuck. Fail the controller initialization just
> + * to be safe.
> + */
> + return -ENODEV;
> + }
> blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
> nvme_unfreeze(&ctrl->ctrl);
> }
>
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

2021-07-05 16:35:26

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Tue, Jun 29, 2021 at 09:39:30AM +0800, Ming Lei wrote:
> Can you investigate a bit on why there is the hang? FC shouldn't use
> managed IRQ, so the interrupt won't be shutdown.

So far, I was not able to figure out why this hangs. In my test setup I
don't have to do any I/O, I just toggle the remote port.

grep busy /sys/kernel/debug/block/*/hctx*/tags | grep -v busy=0

and this seems to confirm, no I/O in flight.

So I started to look at the q_usage_counter. The obvious observational
is that counter is not 0. The least bit is set, thus we are in atomic
mode.

(gdb) p/x *((struct request_queue*)0xffff8ac992fbef20)->q_usage_counter->data
$10 = {
count = {
counter = 0x8000000000000001
},
release = 0xffffffffa02e78b0,
confirm_switch = 0x0,
force_atomic = 0x0,
allow_reinit = 0x1,
rcu = {
next = 0x0,
func = 0x0
},
ref = 0xffff8ac992fbef30
}

I am a bit confused about the percpu-refcount API. My naive
interpretation is that when we are in atomic mode percpu_ref_is_zero()
can't be used. But this seems rather strange. I must miss something.

2021-07-06 07:30:37

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Mon, Jul 05, 2021 at 06:34:00PM +0200, Daniel Wagner wrote:
> On Tue, Jun 29, 2021 at 09:39:30AM +0800, Ming Lei wrote:
> > Can you investigate a bit on why there is the hang? FC shouldn't use
> > managed IRQ, so the interrupt won't be shutdown.
>
> So far, I was not able to figure out why this hangs. In my test setup I
> don't have to do any I/O, I just toggle the remote port.
>
> grep busy /sys/kernel/debug/block/*/hctx*/tags | grep -v busy=0
>
> and this seems to confirm, no I/O in flight.

What is the output of the following command after the hang is triggered?

(cd /sys/kernel/debug/block/nvme0n1 && find . -type f -exec grep -aH . {} \;)

Suppose the hang disk is nvme0n1.

>
> So I started to look at the q_usage_counter. The obvious observational
> is that counter is not 0. The least bit is set, thus we are in atomic
> mode.
>
> (gdb) p/x *((struct request_queue*)0xffff8ac992fbef20)->q_usage_counter->data
> $10 = {
> count = {
> counter = 0x8000000000000001
> },
> release = 0xffffffffa02e78b0,
> confirm_switch = 0x0,
> force_atomic = 0x0,
> allow_reinit = 0x1,
> rcu = {
> next = 0x0,
> func = 0x0
> },
> ref = 0xffff8ac992fbef30
> }
>
> I am a bit confused about the percpu-refcount API. My naive
> interpretation is that when we are in atomic mode percpu_ref_is_zero()
> can't be used. But this seems rather strange. I must miss something.

No, percpu_ref_is_zero() is fine to be called in atomic mode.


Thanks,
Ming

2021-07-06 08:14:11

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Tue, Jul 06, 2021 at 03:29:11PM +0800, Ming Lei wrote:
> > and this seems to confirm, no I/O in flight.
>
> What is the output of the following command after the hang is triggered?
>
> (cd /sys/kernel/debug/block/nvme0n1 && find . -type f -exec grep -aH . {} \;)
>
> Suppose the hang disk is nvme0n1.

see attachement

> No, percpu_ref_is_zero() is fine to be called in atomic mode.

Okay, that is what I hoped for :)


Attachments:
(No filename) (438.00 B)
blk-debug.txt (17.94 kB)
Download all attachments

2021-07-06 08:46:38

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Tue, Jul 06, 2021 at 10:10:10AM +0200, Daniel Wagner wrote:
> On Tue, Jul 06, 2021 at 03:29:11PM +0800, Ming Lei wrote:
> > > and this seems to confirm, no I/O in flight.
> >
> > What is the output of the following command after the hang is triggered?
> >
> > (cd /sys/kernel/debug/block/nvme0n1 && find . -type f -exec grep -aH . {} \;)
> >
> > Suppose the hang disk is nvme0n1.
>
> see attachement
>
> > No, percpu_ref_is_zero() is fine to be called in atomic mode.
>
> Okay, that is what I hoped for :)

> /sys/kernel/debug/block/nvme0c0n1# find . -type f -exec grep -aH . {} \;

It is the mpath device's debugfs, what is output for the nvmef's
debugfs?


Thanks,
Ming

2021-07-06 09:02:26

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Tue, Jul 06, 2021 at 04:45:30PM +0800, Ming Lei wrote:
> > /sys/kernel/debug/block/nvme0c0n1# find . -type f -exec grep -aH . {} \;
>
> It is the mpath device's debugfs, what is output for the nvmef's
> debugfs?

Do you mean /sys/kernel/debug/block/{nvme0n1,nvme0n2}? These are
directories are empty.

There is only /sys/class/nvme/nvme0, but I don't think this is what you
are asking for.


Attachments:
(No filename) (405.00 B)
blk-debug2.txt (8.53 kB)
Download all attachments

2021-07-06 12:29:04

by Daniel Wagner

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

A nvme_start_freeze() before nvme_wait_freeze() fixes the hangers. It is this
simple?

2021-07-07 02:48:14

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH 2/2] nvme-fc: Wait with a timeout for queue to freeze

On Tue, Jul 06, 2021 at 02:21:21PM +0200, Daniel Wagner wrote:
> A nvme_start_freeze() before nvme_wait_freeze() fixes the hangers. It is this
> simple?

Yeah, it can be the issue, also nvme_start_freeze() has to be paired
with nvme_unfreeze().


Thanks,
Ming