2019-10-03 19:16:22

by Tyler Ramer

Subject: [PATCH] nvme-pci: Shutdown when removing dead controller

Always shut down the controller when nvme_remove_dead_ctrl() is
reached.

It's possible for nvme_remove_dead_ctrl() to be called as part of a
failed reset, when there is a bad NVME_CSTS. The controller won't
be coming back online, so we should shut it down rather than just
disable it.

Signed-off-by: Tyler Ramer <[email protected]>
---
drivers/nvme/host/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c0808f9eb8ab..c3f5ba22c625 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2509,7 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
 static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
 {
        nvme_get_ctrl(&dev->ctrl);
-       nvme_dev_disable(dev, false);
+       nvme_dev_disable(dev, true);
        nvme_kill_queues(&dev->ctrl);
        if (!queue_work(nvme_wq, &dev->remove_work))
                nvme_put_ctrl(&dev->ctrl);
--
2.23.0


2019-10-04 15:38:02

by Tyler Ramer

Subject: Re: [PATCH] nvme-pci: Shutdown when removing dead controller

Here's a failure we had which represents the issue the patch is
intended to solve:

Aug 26 15:00:56 testhost kernel: nvme nvme4: async event result 00010300
Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will
reset: CSTS=0x3, PCI_STATUS=0x10
Aug 26 15:02:10 testhost kernel: nvme nvme4: Device not ready; aborting reset
Aug 26 15:02:10 testhost kernel: nvme nvme4: Removing after probe
failure status: -19

The CSTS warning comes from nvme_timeout() and is printed by
nvme_warn_reset(). A reset then occurs, and the controller state
should be NVME_CTRL_RESETTING.

Now, in nvme_reset_work(), the controller is never marked
NVME_CTRL_CONNECTING at:

if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING))

because several lines above, nvme_pci_configure_admin_queue() returns a
bad result (-ENODEV), which triggers a goto out_unlock and prints
"Removing after probe failure status: -19".

Because the state is never changed to NVME_CTRL_CONNECTING or
NVME_CTRL_DELETING, the logic added in
https://github.com/torvalds/linux/commit/2036f7263d70e67d70a67899a468588cb7356bc9
should not apply. We can further validate that dev->ctrl.state ==
NVME_CTRL_RESETTING thanks to the WARN_ON in nvme_reset_work().
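
For reference, here is roughly the shape of the failing path in
nvme_reset_work(), trimmed by hand, so details may differ slightly
from tree to tree:

static void nvme_reset_work(struct work_struct *work)
{
        struct nvme_dev *dev =
                container_of(work, struct nvme_dev, ctrl.reset_work);
        int result;

        if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) {
                result = -ENODEV;
                goto out;
        }
        ...
        result = nvme_pci_configure_admin_queue(dev);
        if (result)
                goto out_unlock;        /* our case: -ENODEV (-19) */
        ...
        /* never reached in this failure: */
        if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
        ...
 out_unlock:
        mutex_unlock(&dev->shutdown_lock);
 out:
        if (result)
                dev_warn(dev->ctrl.device,
                         "Removing after probe failure status: %d\n", result);
        nvme_remove_dead_ctrl(dev);
}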

On Thu, Oct 3, 2019 at 3:13 PM Tyler Ramer <[email protected]> wrote:
>
> Always shut down the controller when nvme_remove_dead_ctrl() is
> reached.
>
> It's possible for nvme_remove_dead_ctrl() to be called as part of a
> failed reset, when there is a bad NVME_CSTS. The controller won't
> be coming back online, so we should shut it down rather than just
> disable it.
>
> Signed-off-by: Tyler Ramer <[email protected]>
> ---
> drivers/nvme/host/pci.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index c0808f9eb8ab..c3f5ba22c625 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2509,7 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
>  static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
>  {
>         nvme_get_ctrl(&dev->ctrl);
> -       nvme_dev_disable(dev, false);
> +       nvme_dev_disable(dev, true);
>         nvme_kill_queues(&dev->ctrl);
>         if (!queue_work(nvme_wq, &dev->remove_work))
>                 nvme_put_ctrl(&dev->ctrl);
> --
> 2.23.0
>

2019-10-05 02:14:38

by Singh, Balbir

Subject: Re: [PATCH] nvme-pci: Shutdown when removing dead controller

On Fri, 2019-10-04 at 11:36 -0400, Tyler Ramer wrote:
> Here's a failure we had which represents the issue the patch is
> intended to solve:
>
> Aug 26 15:00:56 testhost kernel: nvme nvme4: async event result 00010300
> Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will
> reset: CSTS=0x3, PCI_STATUS=0x10
> Aug 26 15:02:10 testhost kernel: nvme nvme4: Device not ready; aborting
> reset
> Aug 26 15:02:10 testhost kernel: nvme nvme4: Removing after probe
> failure status: -19
>
> The CSTS warning comes from nvme_timeout() and is printed by
> nvme_warn_reset(). A reset then occurs, and the controller state
> should be NVME_CTRL_RESETTING.
>
> Now, in nvme_reset_work(), the controller is never marked
> NVME_CTRL_CONNECTING at:
>
> if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING))
>
> because several lines above, nvme_pci_configure_admin_queue() returns a
> bad result (-ENODEV), which triggers a goto out_unlock and prints
> "Removing after probe failure status: -19".
>
> Because the state is never changed to NVME_CTRL_CONNECTING or
> NVME_CTRL_DELETING, the logic added in
> https://github.com/torvalds/linux/commit/2036f7263d70e67d70a67899a468588cb7356bc9
> should not apply. We can further validate that dev->ctrl.state ==
> NVME_CTRL_RESETTING thanks to the WARN_ON in nvme_reset_work().
>
> On Thu, Oct 3, 2019 at 3:13 PM Tyler Ramer <[email protected]> wrote:
> >
> > Always shut down the controller when nvme_remove_dead_ctrl() is
> > reached.
> >
> > It's possible for nvme_remove_dead_ctrl() to be called as part of a
> > failed reset, when there is a bad NVME_CSTS. The controller won't
> > be coming back online, so we should shut it down rather than just
> > disable it.
> >

What is the bad CSTS bit? CSTS.RDY? The entire reset/disable race is quite
tricky in general; it was made better with the shutdown_lock, but if the
timeout value is small, several timeouts can occur in the middle of a reset.

For this patch

Acked-by: Balbir Singh <[email protected]>

> > Signed-off-by: Tyler Ramer <[email protected]>
> > ---
> > drivers/nvme/host/pci.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index c0808f9eb8ab..c3f5ba22c625 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -2509,7 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
> >  static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
> >  {
> >         nvme_get_ctrl(&dev->ctrl);
> > -       nvme_dev_disable(dev, false);
> > +       nvme_dev_disable(dev, true);
> >         nvme_kill_queues(&dev->ctrl);
> >         if (!queue_work(nvme_wq, &dev->remove_work))
> >                 nvme_put_ctrl(&dev->ctrl);
> > --
> > 2.23.0
> >
>
> _______________________________________________
> Linux-nvme mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-nvme

2019-10-05 22:00:17

by Tyler Ramer

Subject: Re: [PATCH] nvme-pci: Shutdown when removing dead controller

> What is the bad CSTS bit? CSTS.RDY?

The reset will be triggered by the result of nvme_should_reset():

static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
{
        /* If true, indicates loss of adapter communication, possibly by a
         * NVMe Subsystem reset.
         */
        bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO);

The csts value passed in is read from the controller in nvme_timeout():

static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
{
...
        u32 csts = readl(dev->bar + NVME_REG_CSTS);
...
        /*
         * Reset immediately if the controller is failed
         */
        if (nvme_should_reset(dev, csts)) {
                nvme_warn_reset(dev, csts);
                nvme_dev_disable(dev, false);
                nvme_reset_ctrl(&dev->ctrl);


Again, here's the message printed by nvme_warn_reset:

Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will
reset: CSTS=0x3, PCI_STATUS=0x10

From include/linux/nvme.h:
NVME_REG_CSTS   = 0x001c,       /* Controller Status */
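
To answer the CSTS.RDY question directly: going by the CSTS bit
definitions in that same header, 0x3 is not RDY alone but RDY and CFS
set together, and a few lines below the excerpt above nvme_should_reset()
tests (csts & NVME_CSTS_CFS), so CFS appears to be the trigger here:

enum {
        NVME_CSTS_RDY           = 1 << 0,       /* controller ready */
        NVME_CSTS_CFS           = 1 << 1,       /* controller fatal status */
        NVME_CSTS_NSSRO         = 1 << 4,       /* NVM subsystem reset occurred */
};

/* CSTS=0x3 == NVME_CSTS_RDY | NVME_CSTS_CFS */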

- Tyler

2019-10-06 19:22:32

by Keith Busch

Subject: Re: [PATCH] nvme-pci: Shutdown when removing dead controller

On Fri, Oct 04, 2019 at 11:36:42AM -0400, Tyler Ramer wrote:
> Here's a failure we had which represents the issue the patch is
> intended to solve:
>
> Aug 26 15:00:56 testhost kernel: nvme nvme4: async event result 00010300
> Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will
> reset: CSTS=0x3, PCI_STATUS=0x10
> Aug 26 15:02:10 testhost kernel: nvme nvme4: Device not ready; aborting reset
> Aug 26 15:02:10 testhost kernel: nvme nvme4: Removing after probe
> failure status: -19
>
> The CSTS warning comes from nvme_timeout() and is printed by
> nvme_warn_reset(). A reset then occurs, and the controller state
> should be NVME_CTRL_RESETTING.
>
> Now, in nvme_reset_work(), the controller is never marked
> NVME_CTRL_CONNECTING at:
>
> if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING))
>
> because several lines above, nvme_pci_configure_admin_queue() returns a
> bad result (-ENODEV), which triggers a goto out_unlock and prints
> "Removing after probe failure status: -19".
>
> Because the state is never changed to NVME_CTRL_CONNECTING or
> NVME_CTRL_DELETING, the logic added in
> https://github.com/torvalds/linux/commit/2036f7263d70e67d70a67899a468588cb7356bc9
> should not apply.

Nor should it: there is no IO in flight at this point, so there can't be
any timeout work to check the state.

> We can further validate that dev->ctrl.state ==
> NVME_CTRL_RESETTING thanks to the WARN_ON in nvme_reset_work().

I'm not sure I see what this is fixing. Setting shutdown to true is
usually just to get the queues flushed, but the nvme_kill_queues() that
we call accomplishes the same thing.

> On Thu, Oct 3, 2019 at 3:13 PM Tyler Ramer <[email protected]> wrote:
> >
> > Always shut down the controller when nvme_remove_dead_ctrl() is
> > reached.
> >
> > It's possible for nvme_remove_dead_ctrl() to be called as part of a
> > failed reset, when there is a bad NVME_CSTS. The controller won't
> > be coming back online, so we should shut it down rather than just
> > disable it.
> >
> > Signed-off-by: Tyler Ramer <[email protected]>
> > ---
> > drivers/nvme/host/pci.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index c0808f9eb8ab..c3f5ba22c625 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -2509,7 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
> >  static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
> >  {
> >         nvme_get_ctrl(&dev->ctrl);
> > -       nvme_dev_disable(dev, false);
> > +       nvme_dev_disable(dev, true);
> >         nvme_kill_queues(&dev->ctrl);
> >         if (!queue_work(nvme_wq, &dev->remove_work))
> >                 nvme_put_ctrl(&dev->ctrl);
> > --
> > 2.23.0
> >

2019-10-07 15:13:03

by Tyler Ramer

Subject: Re: [PATCH] nvme-pci: Shutdown when removing dead controller

> Setting the shutdown to true is
> usually just to get the queues flushed, but the nvme_kill_queues() that
> we call accomplishes the same thing.

The intention of this patch was to clean up another location where
nvme_dev_disable() is called with shutdown == false even though the
device is being removed due to a failure condition, so it should be
shut down.

Perhaps though, given that nvme_kill_queues() provides a subset of the
functionality of nvme_dev_disable() with shutdown == true, we can just
use nvme_dev_disable() and remove nvme_kill_queues()?

This would make nvme_remove_dead_ctrl() more in line with nvme_remove(),
nvme_shutdown(), etc.

- Tyler

2019-10-07 15:47:31

by Keith Busch

Subject: Re: [PATCH] nvme-pci: Shutdown when removing dead controller

On Mon, Oct 07, 2019 at 11:13:12AM -0400, Tyler Ramer wrote:
> > Setting shutdown to true is
> > usually just to get the queues flushed, but the nvme_kill_queues() that
> > we call accomplishes the same thing.
>
> The intention of this patch was to clean up another location where
> nvme_dev_disable() is called with shutdown == false even though the
> device is being removed due to a failure condition, so it should be
> shut down.
>
> Perhaps though, given that nvme_kill_queues() provides a subset of the
> functionality of nvme_dev_disable() with shutdown == true, we can just
> use nvme_dev_disable() and remove nvme_kill_queues()?
>
> This would make nvme_remove_dead_ctrl() more in line with nvme_remove(),
> nvme_shutdown(), etc.

It's fine to use shutdown == true in this path as well; I just wanted to
understand what problem it was fixing. It doesn't sound like your scenario
is going to end up setting CC.SHN, so the only thing the shutdown should
accomplish is flushing pending IO, but we already call kill_queues() right
after the nvme_dev_disable(), so that part should be okay.
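
For reference, shutdown == true routes nvme_dev_disable() through
nvme_shutdown_ctrl() in core.c, which looks roughly like the following
(trimmed). On a controller that is already dead, the CC.SHN write most
likely has no effect and we just poll CSTS.SHST until shutdown_timeout
expires:

int nvme_shutdown_ctrl(struct nvme_ctrl *ctrl)
{
        unsigned long timeout = jiffies + (ctrl->shutdown_timeout * HZ);
        u32 csts;
        int ret;

        /* request a normal shutdown by setting CC.SHN */
        ctrl->ctrl_config &= ~NVME_CC_SHN_MASK;
        ctrl->ctrl_config |= NVME_CC_SHN_NORMAL;

        ret = ctrl->ops->reg_write32(ctrl, NVME_REG_CC, ctrl->ctrl_config);
        if (ret)
                return ret;

        /* wait for CSTS.SHST to report shutdown processing complete */
        while ((ret = ctrl->ops->reg_read32(ctrl, NVME_REG_CSTS, &csts)) == 0) {
                if ((csts & NVME_CSTS_SHST_MASK) == NVME_CSTS_SHST_CMPLT)
                        break;

                msleep(100);
                if (fatal_signal_pending(current))
                        return -EINTR;
                if (time_after(jiffies, timeout)) {
                        dev_err(ctrl->device,
                                "Device shutdown incomplete; abort shutdown\n");
                        return -ENODEV;
                }
        }

        return ret;
}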

2019-10-07 17:52:42

by Tyler Ramer

Subject: [PATCH v2] nvme-pci: Shutdown when removing dead controller

Shut down the controller when nvme_remove_dead_ctrl() is
reached.

If nvme_remove_dead_ctrl() is called, the controller won't
be coming back online, so we should shut it down rather than just
disable it.

Remove nvme_kill_queues() as nvme_dev_remove() will take care of
unquiescing queues.

Signed-off-by: Tyler Ramer <[email protected]>

---

Changes since v1:
* Clean up commit message
* Remove nvme_kill_queues()
---
drivers/nvme/host/pci.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c0808f9eb8ab..68d5fb880d80 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2509,8 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
 static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
 {
        nvme_get_ctrl(&dev->ctrl);
-       nvme_dev_disable(dev, false);
-       nvme_kill_queues(&dev->ctrl);
+       nvme_dev_disable(dev, true);
        if (!queue_work(nvme_wq, &dev->remove_work))
                nvme_put_ctrl(&dev->ctrl);
 }
--
2.23.0

2019-10-07 18:29:20

by Keith Busch

Subject: Re: [PATCH v2] nvme-pci: Shutdown when removing dead controller

On Mon, Oct 07, 2019 at 01:50:11PM -0400, Tyler Ramer wrote:
> Shut down the controller when nvme_remove_dead_ctrl() is
> reached.
>
> If nvme_remove_dead_ctrl() is called, the controller won't
> be coming back online, so we should shut it down rather than just
> disable it.
>
> Remove nvme_kill_queues() as nvme_dev_remove() will take care of
> unquiescing queues.


We do still need to kill the queues, though. Setting shutdown == true just
flushes all pending requests. Killing queues does that too, but it also sets
the request_queue to dying, which will terminate syncing any dirty pages.
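
For reference, nvme_kill_queues() in core.c is roughly:

void nvme_kill_queues(struct nvme_ctrl *ctrl)
{
        struct nvme_ns *ns;

        down_read(&ctrl->namespaces_rwsem);

        /* Forcibly unquiesce queues to avoid blocking dispatch */
        if (ctrl->admin_q && !blk_queue_dying(ctrl->admin_q))
                blk_mq_unquiesce_queue(ctrl->admin_q);

        list_for_each_entry(ns, &ctrl->namespaces, list)
                /* sets NVME_NS_DEAD, marks the queue dying, unquiesces */
                nvme_set_queue_dying(ns);

        up_read(&ctrl->namespaces_rwsem);
}

The blk_set_queue_dying() inside nvme_set_queue_dying() is what fails
further IO, including writeback of dirty pages, instead of letting it
queue up forever.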

> ---
>
> Changes since v1:
> * Clean up commit message
> * Remove nvme_kill_queues()
> ---
> drivers/nvme/host/pci.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index c0808f9eb8ab..68d5fb880d80 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2509,8 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
>  static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
>  {
>         nvme_get_ctrl(&dev->ctrl);
> -       nvme_dev_disable(dev, false);
> -       nvme_kill_queues(&dev->ctrl);
> +       nvme_dev_disable(dev, true);
>         if (!queue_work(nvme_wq, &dev->remove_work))
>                 nvme_put_ctrl(&dev->ctrl);
>  }
> --
> 2.23.0
>

2019-10-07 19:34:32

by Tyler Ramer

Subject: Re: [PATCH v2] nvme-pci: Shutdown when removing dead controller

Keith,

Thanks for clarifying. I appreciate the comments.

2019-10-07 22:13:16

by Singh, Balbir

Subject: Re: [PATCH] nvme-pci: Shutdown when removing dead controller

On Thu, 2019-10-03 at 15:13 -0400, Tyler Ramer wrote:
> Always shut down the controller when nvme_remove_dead_ctrl() is
> reached.
>
> It's possible for nvme_remove_dead_ctrl() to be called as part of a
> failed reset, when there is a bad NVME_CSTS. The controller won't
> be coming back online, so we should shut it down rather than just
> disable it.
>

I would add that nvme_timeout() goes through the nvme_should_reset() path,
where we don't shut down the device during nvme_dev_disable(); it makes sense
to do a full shutdown in nvme_remove_dead_ctrl() when the reset fails.

> Signed-off-by: Tyler Ramer <[email protected]>
> ---

Reviewed-by: Balbir Singh <[email protected]>

> drivers/nvme/host/pci.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index c0808f9eb8ab..c3f5ba22c625 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2509,7 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
>  static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
>  {
>         nvme_get_ctrl(&dev->ctrl);
> -       nvme_dev_disable(dev, false);
> +       nvme_dev_disable(dev, true);
>         nvme_kill_queues(&dev->ctrl);
>         if (!queue_work(nvme_wq, &dev->remove_work))
>                 nvme_put_ctrl(&dev->ctrl);

2019-10-11 14:28:39

by Tyler Ramer

Subject: [PATCH v3] Always shut down the controller when nvme_remove_dead_ctrl is reached.

nvme_timeout() will go through the nvme_should_reset() path, which won't
shut down the device during nvme_dev_disable(). If the reset fails, then
the controller is bad and won't be coming back online, so it makes sense
to explicitly do a full shutdown in nvme_remove_dead_ctrl().

Signed-off-by: Tyler Ramer <[email protected]>
Reviewed-by: Balbir Singh <[email protected]>

---
Changes since v2:
* Clean up commit message with comment from Balbir
* Still call nvme_kill_queues()
---
drivers/nvme/host/pci.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c0808f9eb8ab..c3f5ba22c625 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2509,7 +2509,7 @@ static void nvme_pci_free_ctrl(struct nvme_ctrl *ctrl)
 static void nvme_remove_dead_ctrl(struct nvme_dev *dev)
 {
        nvme_get_ctrl(&dev->ctrl);
-       nvme_dev_disable(dev, false);
+       nvme_dev_disable(dev, true);
        nvme_kill_queues(&dev->ctrl);
        if (!queue_work(nvme_wq, &dev->remove_work))
                nvme_put_ctrl(&dev->ctrl);
--
2.23.0