2024-04-15 12:43:42

by Daniel Wagner

Subject: [PATCH v6 0/5] nvme-fabrics: short-circuit connect retries

I've reordered the series so that the nvmet changes come first instead
of last. These changes are necessary to get the host code tested.

The second big change is moving the logic into the existing helper
nvmf_should_reconnect(); all the transports get updated in one go.

And lastly, I've added an auth failure retry patch as requested by Sagi.
I explicitly decided against counting these errors with the normal retry
counter (nr_reconnects), as this would make it unnecessarily complex to
map the two different max limits into one counter.

changes:
v6:
- reorder series
- extended nvmf_should_reconnect
- added auth retry logic

v5:
- nvme: do not mix kernel error code with nvme status
- nvmet: separate nvme status from dhchap status
- https://lore.kernel.org/linux-nvme/[email protected]/

v4:
- rebased
- added 'nvme: fixes for authentication errors' series
- https://lore.kernel.org/all/[email protected]/

v3:
- added my SOB tag
- fixed indention
- https://lore.kernel.org/linux-nvme/[email protected]/

v2:
- refresh/rebase on current head
- extended blktests (nvme/045) to cover this case
(see separate post)
- https://lore.kernel.org/linux-nvme/[email protected]/

v1:
- initial version
- https://lore.kernel.org/linux-nvme/[email protected]/

Daniel Wagner (1):
nvme-fabrics: handle transient auth failures

Hannes Reinecke (4):
nvmet: lock config semaphore when accessing DH-HMAC-CHAP key
nvmet: return DHCHAP status codes from nvmet_setup_auth()
nvme: authentication error are always non-retryable
nvme-fabrics: short-circuit reconnect retries

drivers/nvme/host/auth.c | 4 +-
drivers/nvme/host/core.c | 6 +--
drivers/nvme/host/fabrics.c | 58 +++++++++++++++++---------
drivers/nvme/host/fabrics.h | 2 +-
drivers/nvme/host/fc.c | 6 +--
drivers/nvme/host/nvme.h | 3 +-
drivers/nvme/host/rdma.c | 20 +++++----
drivers/nvme/host/tcp.c | 23 ++++++----
drivers/nvme/target/auth.c | 22 +++++-----
drivers/nvme/target/configfs.c | 22 +++++++---
drivers/nvme/target/fabrics-cmd-auth.c | 49 +++++++++++-----------
drivers/nvme/target/fabrics-cmd.c | 11 ++---
drivers/nvme/target/nvmet.h | 8 ++--
13 files changed, 140 insertions(+), 94 deletions(-)

--
2.44.0



2024-04-15 12:43:58

by Daniel Wagner

Subject: [PATCH v6 5/5] nvme-fabrics: handle transient auth failures

Retry authentication a couple of times to avoid erroring out on
transient failures. For example, if a new key is deployed on the host
and the target while the connection drops, the host might use the wrong
key to reconnect.

Suggested-by: Sagi Grimberg <[email protected]>
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/auth.c | 4 ++--
drivers/nvme/host/fabrics.c | 10 +++++++++-
drivers/nvme/host/fc.c | 2 ++
drivers/nvme/host/nvme.h | 1 +
drivers/nvme/host/rdma.c | 1 +
drivers/nvme/host/tcp.c | 1 +
6 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
index a264b3ae078b..368dcc6ee11b 100644
--- a/drivers/nvme/host/auth.c
+++ b/drivers/nvme/host/auth.c
@@ -797,7 +797,7 @@ static void nvme_queue_auth_work(struct work_struct *work)
NVME_AUTH_DHCHAP_MESSAGE_SUCCESS1);
if (ret) {
chap->status = ret;
- chap->error = -ECONNREFUSED;
+ chap->error = -EKEYREJECTED;
return;
}

@@ -818,7 +818,7 @@ static void nvme_queue_auth_work(struct work_struct *work)
ret = nvme_auth_process_dhchap_success1(ctrl, chap);
if (ret) {
/* Controller authentication failed */
- chap->error = -ECONNREFUSED;
+ chap->error = -EKEYREJECTED;
goto fail2;
}

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index d9a73b1b41c4..6dfa45dce232 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -573,8 +573,16 @@ EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);
*/
bool nvmf_should_reconnect(struct nvme_ctrl *ctrl, int status)
{
- if (status < 0)
+ if (status < 0) {
+ /*
+ * authentication errors can be transient, thus retry a couple
+ * of times before giving up.
+ */
+ if (status == -EKEYREJECTED &&
+ ++ctrl->nr_auth_retries < 3)
+ return true;
return false;
+ }
else if (status > 0 && (status & NVME_SC_DNR))
return false;

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index f0b081332749..a22d2af18553 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3176,6 +3176,7 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl)
changed = nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_LIVE);

ctrl->ctrl.nr_reconnects = 0;
+ ctrl->ctrl.nr_auth_retries = 0;

if (changed)
nvme_start_ctrl(&ctrl->ctrl);
@@ -3491,6 +3492,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,

ctrl->ctrl.opts = opts;
ctrl->ctrl.nr_reconnects = 0;
+ ctrl->ctrl.nr_auth_retries = 0;
INIT_LIST_HEAD(&ctrl->ctrl_list);
ctrl->lport = lport;
ctrl->rport = rport;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 9b8904a476b8..a9af20f25405 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -382,6 +382,7 @@ struct nvme_ctrl {
u16 icdoff;
u16 maxcmd;
int nr_reconnects;
+ int nr_auth_retries;
unsigned long flags;
struct nvmf_ctrl_options *opts;

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 821ab3e0fd3b..f5b3201a11b5 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1117,6 +1117,7 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
ctrl->ctrl.nr_reconnects);

ctrl->ctrl.nr_reconnects = 0;
+ ctrl->ctrl.nr_auth_retries = 0;

return;

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 3e0c33323320..45270c626c87 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2266,6 +2266,7 @@ static void nvme_tcp_reconnect_ctrl_work(struct work_struct *work)
ctrl->nr_reconnects);

ctrl->nr_reconnects = 0;
+ ctrl->nr_auth_retries = 0;

return;

--
2.44.0


2024-04-15 12:44:00

by Daniel Wagner

Subject: [PATCH v6 4/5] nvme-fabrics: short-circuit reconnect retries

From: Hannes Reinecke <[email protected]>

Returning an NVMe status from nvme_tcp_setup_ctrl() indicates that the
association was established and we have received a status from the
controller; consequently we should honour the DNR bit. If we do not, any
future reconnect attempts will just return the same error, so we can
short-circuit the reconnect attempts and fail the connection directly.

Another reason not to retry reconnects is if the transport returns a
negative error code.

Signed-off-by: Hannes Reinecke <[email protected]>
[dwagner: - extended nvmf_should_reconnect]
Signed-off-by: Daniel Wagner <[email protected]>
---
drivers/nvme/host/fabrics.c | 19 ++++++++++++++++++-
drivers/nvme/host/fabrics.h | 2 +-
drivers/nvme/host/fc.c | 4 +---
drivers/nvme/host/rdma.c | 19 ++++++++++++-------
drivers/nvme/host/tcp.c | 22 ++++++++++++++--------
5 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index f7eaf9580b4f..d9a73b1b41c4 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -559,8 +559,25 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
}
EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);

-bool nvmf_should_reconnect(struct nvme_ctrl *ctrl)
+/*
+ * Evaluate the status information returned by the transport in order to decide
+ * if a reconnect attempt should be scheduled.
+ *
+ * There are two cases where no reconnect attempt should be made:
+ *
+ * 1) The transport reports a negative status. There was an error (e.g. no
+ * memory) on the host side, thus abort the operation.
+ *
+ * 2) The DNR bit is set and the specification states no further connect
+ * attempts with the same set of parameters should be attempted.
+ */
+bool nvmf_should_reconnect(struct nvme_ctrl *ctrl, int status)
{
+ if (status < 0)
+ return false;
+ else if (status > 0 && (status & NVME_SC_DNR))
+ return false;
+
if (ctrl->opts->max_reconnects == -1 ||
ctrl->nr_reconnects < ctrl->opts->max_reconnects)
return true;
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index 37c974c38dcb..602135910ae9 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -223,7 +223,7 @@ int nvmf_register_transport(struct nvmf_transport_ops *ops);
void nvmf_unregister_transport(struct nvmf_transport_ops *ops);
void nvmf_free_options(struct nvmf_ctrl_options *opts);
int nvmf_get_address(struct nvme_ctrl *ctrl, char *buf, int size);
-bool nvmf_should_reconnect(struct nvme_ctrl *ctrl);
+bool nvmf_should_reconnect(struct nvme_ctrl *ctrl, int status);
bool nvmf_ip_options_match(struct nvme_ctrl *ctrl,
struct nvmf_ctrl_options *opts);
void nvmf_set_io_queues(struct nvmf_ctrl_options *opts, u32 nr_io_queues,
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index a5b29e9ad342..f0b081332749 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3310,12 +3310,10 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
dev_info(ctrl->ctrl.device,
"NVME-FC{%d}: reset: Reconnect attempt failed (%d)\n",
ctrl->cnum, status);
- if (status > 0 && (status & NVME_SC_DNR))
- recon = false;
} else if (time_after_eq(jiffies, rport->dev_loss_end))
recon = false;

- if (recon && nvmf_should_reconnect(&ctrl->ctrl)) {
+ if (recon && nvmf_should_reconnect(&ctrl->ctrl, status)) {
if (portptr->port_state == FC_OBJSTATE_ONLINE)
dev_info(ctrl->ctrl.device,
"NVME-FC{%d}: Reconnect attempt in %ld "
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 366f0bb4ebfc..821ab3e0fd3b 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -982,7 +982,8 @@ static void nvme_rdma_free_ctrl(struct nvme_ctrl *nctrl)
kfree(ctrl);
}

-static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
+static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl,
+ int status)
{
enum nvme_ctrl_state state = nvme_ctrl_state(&ctrl->ctrl);

@@ -992,7 +993,7 @@ static void nvme_rdma_reconnect_or_remove(struct nvme_rdma_ctrl *ctrl)
return;
}

- if (nvmf_should_reconnect(&ctrl->ctrl)) {
+ if (nvmf_should_reconnect(&ctrl->ctrl, status)) {
dev_info(ctrl->ctrl.device, "Reconnecting in %d seconds...\n",
ctrl->ctrl.opts->reconnect_delay);
queue_delayed_work(nvme_wq, &ctrl->reconnect_work,
@@ -1104,10 +1105,12 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
{
struct nvme_rdma_ctrl *ctrl = container_of(to_delayed_work(work),
struct nvme_rdma_ctrl, reconnect_work);
+ int ret;

++ctrl->ctrl.nr_reconnects;

- if (nvme_rdma_setup_ctrl(ctrl, false))
+ ret = nvme_rdma_setup_ctrl(ctrl, false);
+ if (ret)
goto requeue;

dev_info(ctrl->ctrl.device, "Successfully reconnected (%d attempts)\n",
@@ -1120,7 +1123,7 @@ static void nvme_rdma_reconnect_ctrl_work(struct work_struct *work)
requeue:
dev_info(ctrl->ctrl.device, "Failed reconnect attempt %d\n",
ctrl->ctrl.nr_reconnects);
- nvme_rdma_reconnect_or_remove(ctrl);
+ nvme_rdma_reconnect_or_remove(ctrl, ret);
}

static void nvme_rdma_error_recovery_work(struct work_struct *work)
@@ -1145,7 +1148,7 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
return;
}

- nvme_rdma_reconnect_or_remove(ctrl);
+ nvme_rdma_reconnect_or_remove(ctrl, 0);
}

static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
@@ -2169,6 +2172,7 @@ static void nvme_rdma_reset_ctrl_work(struct work_struct *work)
{
struct nvme_rdma_ctrl *ctrl =
container_of(work, struct nvme_rdma_ctrl, ctrl.reset_work);
+ int ret;

nvme_stop_ctrl(&ctrl->ctrl);
nvme_rdma_shutdown_ctrl(ctrl, false);
@@ -2179,14 +2183,15 @@ static void nvme_rdma_reset_ctrl_work(struct work_struct *work)
return;
}

- if (nvme_rdma_setup_ctrl(ctrl, false))
+ ret = nvme_rdma_setup_ctrl(ctrl, false);
+ if (ret)
goto out_fail;

return;

out_fail:
++ctrl->ctrl.nr_reconnects;
- nvme_rdma_reconnect_or_remove(ctrl);
+ nvme_rdma_reconnect_or_remove(ctrl, ret);
}

static const struct nvme_ctrl_ops nvme_rdma_ctrl_ops = {
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index fdbcdcedcee9..3e0c33323320 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2155,7 +2155,8 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
nvme_tcp_destroy_io_queues(ctrl, remove);
}

-static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl)
+static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl,
+ int status)
{
enum nvme_ctrl_state state = nvme_ctrl_state(ctrl);

@@ -2165,13 +2166,14 @@ static void nvme_tcp_reconnect_or_remove(struct nvme_ctrl *ctrl)
return;
}

- if (nvmf_should_reconnect(ctrl)) {
+ if (nvmf_should_reconnect(ctrl, status)) {
dev_info(ctrl->device, "Reconnecting in %d seconds...\n",
ctrl->opts->reconnect_delay);
queue_delayed_work(nvme_wq, &to_tcp_ctrl(ctrl)->connect_work,
ctrl->opts->reconnect_delay * HZ);
} else {
- dev_info(ctrl->device, "Removing controller...\n");
+ dev_info(ctrl->device, "Removing controller (%d)...\n",
+ status);
nvme_delete_ctrl(ctrl);
}
}
@@ -2252,10 +2254,12 @@ static void nvme_tcp_reconnect_ctrl_work(struct work_struct *work)
struct nvme_tcp_ctrl *tcp_ctrl = container_of(to_delayed_work(work),
struct nvme_tcp_ctrl, connect_work);
struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;
+ int ret;

++ctrl->nr_reconnects;

- if (nvme_tcp_setup_ctrl(ctrl, false))
+ ret = nvme_tcp_setup_ctrl(ctrl, false);
+ if (ret)
goto requeue;

dev_info(ctrl->device, "Successfully reconnected (%d attempt)\n",
@@ -2268,7 +2272,7 @@ static void nvme_tcp_reconnect_ctrl_work(struct work_struct *work)
requeue:
dev_info(ctrl->device, "Failed reconnect attempt %d\n",
ctrl->nr_reconnects);
- nvme_tcp_reconnect_or_remove(ctrl);
+ nvme_tcp_reconnect_or_remove(ctrl, ret);
}

static void nvme_tcp_error_recovery_work(struct work_struct *work)
@@ -2295,7 +2299,7 @@ static void nvme_tcp_error_recovery_work(struct work_struct *work)
return;
}

- nvme_tcp_reconnect_or_remove(ctrl);
+ nvme_tcp_reconnect_or_remove(ctrl, 0);
}

static void nvme_tcp_teardown_ctrl(struct nvme_ctrl *ctrl, bool shutdown)
@@ -2315,6 +2319,7 @@ static void nvme_reset_ctrl_work(struct work_struct *work)
{
struct nvme_ctrl *ctrl =
container_of(work, struct nvme_ctrl, reset_work);
+ int ret;

nvme_stop_ctrl(ctrl);
nvme_tcp_teardown_ctrl(ctrl, false);
@@ -2328,14 +2333,15 @@ static void nvme_reset_ctrl_work(struct work_struct *work)
return;
}

- if (nvme_tcp_setup_ctrl(ctrl, false))
+ ret = nvme_tcp_setup_ctrl(ctrl, false);
+ if (ret)
goto out_fail;

return;

out_fail:
++ctrl->nr_reconnects;
- nvme_tcp_reconnect_or_remove(ctrl);
+ nvme_tcp_reconnect_or_remove(ctrl, ret);
}

static void nvme_tcp_stop_ctrl(struct nvme_ctrl *ctrl)
--
2.44.0


2024-04-15 20:51:30

by Sagi Grimberg

Subject: Re: [PATCH v6 5/5] nvme-fabrics: handle transient auth failures



On 15/04/2024 15:42, Daniel Wagner wrote:
> Retry authentication a couple of times to avoid error out on transient
> errors. E.g. if a new key is deployed on the host and the target. At the
> same time the connection drops. The host might use the wrong key to
> reconnect.
>
> Suggested-by: Sagi Grimberg <[email protected]>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
> drivers/nvme/host/auth.c | 4 ++--
> drivers/nvme/host/fabrics.c | 10 +++++++++-
> drivers/nvme/host/fc.c | 2 ++
> drivers/nvme/host/nvme.h | 1 +
> drivers/nvme/host/rdma.c | 1 +
> drivers/nvme/host/tcp.c | 1 +
> 6 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index a264b3ae078b..368dcc6ee11b 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -797,7 +797,7 @@ static void nvme_queue_auth_work(struct work_struct *work)
> NVME_AUTH_DHCHAP_MESSAGE_SUCCESS1);
> if (ret) {
> chap->status = ret;
> - chap->error = -ECONNREFUSED;
> + chap->error = -EKEYREJECTED;
> return;
> }
>
> @@ -818,7 +818,7 @@ static void nvme_queue_auth_work(struct work_struct *work)
> ret = nvme_auth_process_dhchap_success1(ctrl, chap);
> if (ret) {
> /* Controller authentication failed */
> - chap->error = -ECONNREFUSED;
> + chap->error = -EKEYREJECTED;
> goto fail2;
> }
>
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index d9a73b1b41c4..6dfa45dce232 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -573,8 +573,16 @@ EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);
> */
> bool nvmf_should_reconnect(struct nvme_ctrl *ctrl, int status)
> {
> - if (status < 0)
> + if (status < 0) {
> + /*
> + * authentication errors can be transient, thus retry a couple
> + * of times before giving up.
> + */
> + if (status == -EKEYREJECTED &&
> + ++ctrl->nr_auth_retries < 3)
> + return true;

I did not suggest nr_auth_retries. Where is the 3 coming from? The
controller already has a number of reconnects before it gives up, no
reason to add another one.

Just don't return false based on the status if it is a transient
authentication error.

The patch just needs to be modified from:

    if (status < 0)

to:

    if (status < 0 && status != -EKEYREJECTED)

plus a comment that explains it.

2024-04-16 06:57:28

by Daniel Wagner

Subject: Re: [PATCH v6 5/5] nvme-fabrics: handle transient auth failures

On Mon, Apr 15, 2024 at 11:51:19PM +0300, Sagi Grimberg wrote:
> > - if (status < 0)
> > + if (status < 0) {
> > + /*
> > + * authentication errors can be transient, thus retry a couple
> > + * of times before giving up.
> > + */
> > + if (status == -EKEYREJECTED &&
> > + ++ctrl->nr_auth_retries < 3)
> > + return true;
>
> I did not suggest nr_auth_retries.

Correct. I explained why I didn't want to use the nr_reconnects counter here.

> Where is the 3 coming from? The controller already
> has a number of reconnects before it gives up, no reason to add
> another one.

Sure, but you seem to consider all EKEYREJECTED errors transient. This
is just not correct. There is also the case where the key is not correct
on the first connect attempt, for example, or the ctrl key gets updated
but not the host key.

> Just don't return false based on the status if it is a transient
> authentication error.

Not all authentication errors are transient.

> The patch just needs to be modified from
> ??? if (status < 0)
> to
> ??? if (status < 0 && status != -EKEYREJECTED Plus a comment that explains
> it.

This is not correct. This patch is already not following the
specification, and I don't think we should bend the rules too much. That
is why I limited the reconnect attempts on auth errors to 3.