LinuxLists.cc - [PATCH] nvme-tcp: Do not reset transport on data digest errors

2021-08-05 17:37:22

Subject: [PATCH] nvme-tcp: Do not reset transport on data digest errors

The spec says

7.4.6.1 Digest Error handling

When a host detects a data digest error in a C2HData PDU, that host
shall continue processing C2HData PDUs associated with the command and
when the command processing has completed, if a successful status was
returned by the controller, the host shall fail the command with a
non-fatal transport error.

Currently the transport is reseted when a data digest error is
detected. To fix this, keep track of the final status in the queue
object and use it when completing the request.

The new member can be placed adjacent to the receive related members and
fits in the cacheline as there is a 4 byte hole.

Signed-off-by: Daniel Wagner <[email protected]>
---

Hi,

I've tested this by modifying the receive path. Via the fault_inject
interface I injecting wrong hash values. The request would then be
completed with status != 0 and nvme_decide_disposition decices to
retry the request. So this seems be in more sync with what the spec
says on this topic.

Daniel

drivers/nvme/host/tcp.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 097f7dd10ed3..5253147df4c7 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -89,6 +89,7 @@ struct nvme_tcp_queue {
size_t data_remaining;
size_t ddgst_remaining;
unsigned int nr_cqe;
+ u16 status;

/* send state */
struct nvme_tcp_request *request;
@@ -496,7 +497,8 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
return -EINVAL;
}

- if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
+ if (!nvme_try_complete_req(rq, queue->status ?
+ queue->status : cqe->status, cqe->result))
nvme_complete_rq(rq);
queue->nr_cqe++;

@@ -676,6 +678,7 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb,

switch (hdr->type) {
case nvme_tcp_c2h_data:
+ queue->status = NVME_SC_SUCCESS;
return nvme_tcp_handle_c2h_data(queue, (void *)queue->pdu);
case nvme_tcp_rsp:
nvme_tcp_init_recv_ctx(queue);
@@ -758,7 +761,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
queue->ddgst_remaining = NVME_TCP_DIGEST_LENGTH;
} else {
if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
- nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
+ nvme_tcp_end_request(rq, queue->status);
queue->nr_cqe++;
}
nvme_tcp_init_recv_ctx(queue);
@@ -792,14 +795,14 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
"data digest error: recv %#x expected %#x\n",
le32_to_cpu(queue->recv_ddgst),
le32_to_cpu(queue->exp_ddgst));
- return -EIO;
+ queue->status = NVME_SC_DATA_XFER_ERROR;
}

if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
struct request *rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
pdu->command_id);

- nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
+ nvme_tcp_end_request(rq, queue->status);
queue->nr_cqe++;
}

--
2.29.2

2021-08-07 00:07:46

by Sagi Grimberg

[permalink] [raw]

Subject: Re: [PATCH] nvme-tcp: Do not reset transport on data digest errors

> The spec says
>
> 7.4.6.1 Digest Error handling
>
> When a host detects a data digest error in a C2HData PDU, that host
> shall continue processing C2HData PDUs associated with the command and
> when the command processing has completed, if a successful status was
> returned by the controller, the host shall fail the command with a
> non-fatal transport error.
>
> Currently the transport is reseted when a data digest error is
> detected. To fix this, keep track of the final status in the queue
> object and use it when completing the request.
>
> The new member can be placed adjacent to the receive related members and
> fits in the cacheline as there is a 4 byte hole.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
>
> Hi,
>
> I've tested this by modifying the receive path. Via the fault_inject
> interface I injecting wrong hash values. The request would then be
> completed with status != 0 and nvme_decide_disposition decices to
> retry the request. So this seems be in more sync with what the spec
> says on this topic.
>
> Daniel
>
> drivers/nvme/host/tcp.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 097f7dd10ed3..5253147df4c7 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -89,6 +89,7 @@ struct nvme_tcp_queue {
> size_t data_remaining;
> size_t ddgst_remaining;
> unsigned int nr_cqe;
> + u16 status;

Why is this a queue member and not a request member?

>
> /* send state */
> struct nvme_tcp_request *request;
> @@ -496,7 +497,8 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
> return -EINVAL;
> }
>
> - if (!nvme_try_complete_req(rq, cqe->status, cqe->result))
> + if (!nvme_try_complete_req(rq, queue->status ?
> + queue->status : cqe->status, cqe->result))
> nvme_complete_rq(rq);
> queue->nr_cqe++;
>
> @@ -676,6 +678,7 @@ static int nvme_tcp_recv_pdu(struct nvme_tcp_queue *queue, struct sk_buff *skb,
>
> switch (hdr->type) {
> case nvme_tcp_c2h_data:
> + queue->status = NVME_SC_SUCCESS;
> return nvme_tcp_handle_c2h_data(queue, (void *)queue->pdu);
> case nvme_tcp_rsp:
> nvme_tcp_init_recv_ctx(queue);
> @@ -758,7 +761,7 @@ static int nvme_tcp_recv_data(struct nvme_tcp_queue *queue, struct sk_buff *skb,
> queue->ddgst_remaining = NVME_TCP_DIGEST_LENGTH;
> } else {
> if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
> - nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
> + nvme_tcp_end_request(rq, queue->status);
> queue->nr_cqe++;
> }
> nvme_tcp_init_recv_ctx(queue);
> @@ -792,14 +795,14 @@ static int nvme_tcp_recv_ddgst(struct nvme_tcp_queue *queue,
> "data digest error: recv %#x expected %#x\n",
> le32_to_cpu(queue->recv_ddgst),
> le32_to_cpu(queue->exp_ddgst));
> - return -EIO;
> + queue->status = NVME_SC_DATA_XFER_ERROR;
> }
>
> if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
> struct request *rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
> pdu->command_id);
>
> - nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
> + nvme_tcp_end_request(rq, queue->status);
> queue->nr_cqe++;
> }
>
>

2021-08-09 10:18:53

by Daniel Wagner

[permalink] [raw]

Subject: Re: [PATCH] nvme-tcp: Do not reset transport on data digest errors

Hi Sagi,

On Fri, Aug 06, 2021 at 12:42:00PM -0700, Sagi Grimberg wrote:
> > @@ -89,6 +89,7 @@ struct nvme_tcp_queue {
> > size_t data_remaining;
> > size_t ddgst_remaining;
> > unsigned int nr_cqe;
> > + u16 status;
>
> Why is this a queue member and not a request member?

I was not sure if the TCP transport specific error handling should
impact all other transport (size of struct request). Also I tried to
avoid accessing cachelines which are not already in use. Except this I
don't see there should be no problem to put this to struct request.

Daniel

2021-08-11 01:04:37

by Sagi Grimberg

[permalink] [raw]

Subject: Re: [PATCH] nvme-tcp: Do not reset transport on data digest errors

> Hi Sagi,
>
> On Fri, Aug 06, 2021 at 12:42:00PM -0700, Sagi Grimberg wrote:
>>> @@ -89,6 +89,7 @@ struct nvme_tcp_queue {
>>> size_t data_remaining;
>>> size_t ddgst_remaining;
>>> unsigned int nr_cqe;
>>> + u16 status;
>>
>> Why is this a queue member and not a request member?
>
> I was not sure if the TCP transport specific error handling should
> impact all other transport (size of struct request). Also I tried to
> avoid accessing cachelines which are not already in use. Except this I
> don't see there should be no problem to put this to struct request.

It is the correct place, lets see that it doesn't increase the struct.

2021-08-11 10:35:06

by Daniel Wagner

[permalink] [raw]

Subject: Re: [PATCH] nvme-tcp: Do not reset transport on data digest errors

On Tue, Aug 10, 2021 at 06:02:36PM -0700, Sagi Grimberg wrote:
>
> > Hi Sagi,
> >
> > On Fri, Aug 06, 2021 at 12:42:00PM -0700, Sagi Grimberg wrote:
> > > > @@ -89,6 +89,7 @@ struct nvme_tcp_queue {
> > > > size_t data_remaining;
> > > > size_t ddgst_remaining;
> > > > unsigned int nr_cqe;
> > > > + u16 status;
> > >
> > > Why is this a queue member and not a request member?
> >
> > I was not sure if the TCP transport specific error handling should
> > impact all other transport (size of struct request). Also I tried to
> > avoid accessing cachelines which are not already in use. Except this I
> > don't see there should be no problem to put this to struct request.
>
> It is the correct place, lets see that it doesn't increase the struct.

It should not according pahole:

/* XXX 7 bytes hole, try to pack */

void * pdu; /* 144 8 */
int pdu_remaining; /* 152 4 */
int pdu_offset; /* 156 4 */
size_t data_remaining; /* 160 8 */
size_t ddgst_remaining; /* 168 8 */
unsigned int nr_cqe; /* 176 4 */
u16 status; /* 180 2 */

/* XXX 2 bytes hole, try to pack */

Daniel

2021-08-18 12:47:19

by Daniel Wagner

[permalink] [raw]

Subject: Re: [PATCH] nvme-tcp: Do not reset transport on data digest errors

On Wed, Aug 11, 2021 at 12:31:46PM +0200, Daniel Wagner wrote:
> > It is the correct place, lets see that it doesn't increase the struct.
>
> It should not according pahole:

Anything I need to do to get this patch accepted?

2021-08-20 10:22:48

by Hannes Reinecke

[permalink] [raw]

Subject: Re: [PATCH] nvme-tcp: Do not reset transport on data digest errors

On 8/5/21 2:15 PM, Daniel Wagner wrote:
> The spec says
>
> 7.4.6.1 Digest Error handling
>
> When a host detects a data digest error in a C2HData PDU, that host
> shall continue processing C2HData PDUs associated with the command and
> when the command processing has completed, if a successful status was
> returned by the controller, the host shall fail the command with a
> non-fatal transport error.
>
> Currently the transport is reseted when a data digest error is
> detected. To fix this, keep track of the final status in the queue
> object and use it when completing the request.
>
> The new member can be placed adjacent to the receive related members and
> fits in the cacheline as there is a 4 byte hole.
>
> Signed-off-by: Daniel Wagner <[email protected]>
> ---
>
Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

2021-08-23 17:16:10

by Sagi Grimberg

[permalink] [raw]

Subject: Re: [PATCH] nvme-tcp: Do not reset transport on data digest errors

>>> It is the correct place, lets see that it doesn't increase the struct.
>>
>> It should not according pahole:
>
> Anything I need to do to get this patch accepted?

Yes, move the status to nvme_tcp_request, I think its
a more appropriate place for it.