2022-08-17 19:01:54

by Fabio M. De Francesco

[permalink] [raw]
Subject: [PATCH v2] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

kmap() is being deprecated in favor of kmap_local_page().[1]

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

The pages which will be mapped are allocated in nvmet_tcp_map_data(),
using the GFP_KERNEL flag. This assures that they cannot come from
HIGHMEM. This imply that a straight page_address() can replace the kmap()
of sg_page(sg) in nvmet_tcp_map_pdu_iovec(). As a side effect, we might
also delete the field "nr_mapped" from struct "nvmet_tcp_cmd" because,
after removing the kmap() calls, there would be no longer any need of it.

Therefore, replace the kmap() of sg_page(sg) with a page_address() and
delete the "nr_mapped" field from "nvmet_tcp_cmd" and instead pass a
local variable to iov_iter_kvec() from the call site in
nvmet_tcp_map_pdu_iovec().

[1] "[PATCH] checkpatch: Add kmap and kmap_atomic to the deprecated
list" https://lore.kernel.org/all/[email protected]/

Cc: Chaitanya Kulkarni <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
---

v1->v2: Use a local variable as argument of iov_iter_kvec() instead of
the removed "nr_mapped" field from struct "nvmet_tcp_cmd". Thanks to
Sagi and Keith who noticed my mistake.

Many thanks to Chaitanya, Keith, Sagi, for the answers and the comments on
the RFC which led to this patch. The RFC is at:
https://lore.kernel.org/all/[email protected]/

drivers/nvme/target/tcp.c | 28 ++++------------------------
1 file changed, 4 insertions(+), 24 deletions(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index dc3b4dc8fe08..5b839f842623 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -77,7 +77,6 @@ struct nvmet_tcp_cmd {
u32 pdu_len;
u32 pdu_recv;
int sg_idx;
- int nr_mapped;
struct msghdr recv_msg;
struct kvec *iov;
u32 flags;
@@ -167,7 +166,6 @@ static const struct nvmet_fabrics_ops nvmet_tcp_ops;
static void nvmet_tcp_free_cmd(struct nvmet_tcp_cmd *c);
static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd);
static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd);
-static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd);

static inline u16 nvmet_tcp_cmd_tag(struct nvmet_tcp_queue *queue,
struct nvmet_tcp_cmd *cmd)
@@ -301,35 +299,21 @@ static int nvmet_tcp_check_ddgst(struct nvmet_tcp_queue *queue, void *pdu)

static void nvmet_tcp_free_cmd_buffers(struct nvmet_tcp_cmd *cmd)
{
- WARN_ON(unlikely(cmd->nr_mapped > 0));
-
kfree(cmd->iov);
sgl_free(cmd->req.sg);
cmd->iov = NULL;
cmd->req.sg = NULL;
}

-static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd)
-{
- struct scatterlist *sg;
- int i;
-
- sg = &cmd->req.sg[cmd->sg_idx];
-
- for (i = 0; i < cmd->nr_mapped; i++)
- kunmap(sg_page(&sg[i]));
-
- cmd->nr_mapped = 0;
-}
-
static void nvmet_tcp_map_pdu_iovec(struct nvmet_tcp_cmd *cmd)
{
struct kvec *iov = cmd->iov;
struct scatterlist *sg;
u32 length, offset, sg_offset;
+ int nr_mapped;

length = cmd->pdu_len;
- cmd->nr_mapped = DIV_ROUND_UP(length, PAGE_SIZE);
+ nr_mapped = DIV_ROUND_UP(length, PAGE_SIZE);
offset = cmd->rbytes_done;
cmd->sg_idx = offset / PAGE_SIZE;
sg_offset = offset % PAGE_SIZE;
@@ -338,7 +322,7 @@ static void nvmet_tcp_map_pdu_iovec(struct nvmet_tcp_cmd *cmd)
while (length) {
u32 iov_len = min_t(u32, length, sg->length - sg_offset);

- iov->iov_base = kmap(sg_page(sg)) + sg->offset + sg_offset;
+ iov->iov_base = page_address(sg_page(sg)) + sg->offset + sg_offset;
iov->iov_len = iov_len;

length -= iov_len;
@@ -347,8 +331,7 @@ static void nvmet_tcp_map_pdu_iovec(struct nvmet_tcp_cmd *cmd)
sg_offset = 0;
}

- iov_iter_kvec(&cmd->recv_msg.msg_iter, READ, cmd->iov,
- cmd->nr_mapped, cmd->pdu_len);
+ iov_iter_kvec(&cmd->recv_msg.msg_iter, READ, cmd->iov, nr_mapped, cmd->pdu_len);
}

static void nvmet_tcp_fatal_error(struct nvmet_tcp_queue *queue)
@@ -1141,7 +1124,6 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
cmd->rbytes_done += ret;
}

- nvmet_tcp_unmap_pdu_iovec(cmd);
if (queue->data_digest) {
nvmet_tcp_prep_recv_ddgst(cmd);
return 0;
@@ -1411,7 +1393,6 @@ static void nvmet_tcp_restore_socket_callbacks(struct nvmet_tcp_queue *queue)
static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd)
{
nvmet_req_uninit(&cmd->req);
- nvmet_tcp_unmap_pdu_iovec(cmd);
nvmet_tcp_free_cmd_buffers(cmd);
}

@@ -1424,7 +1405,6 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)
if (nvmet_tcp_need_data_in(cmd))
nvmet_req_uninit(&cmd->req);

- nvmet_tcp_unmap_pdu_iovec(cmd);
nvmet_tcp_free_cmd_buffers(cmd);
}

--
2.37.1


2022-08-18 11:52:44

by Sagi Grimberg

[permalink] [raw]
Subject: Re: [PATCH v2] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

Reviewed-by: Sagi Grimberg <[email protected]>

Fabio, did you run blktests?

The simplest thing you can do is:

$ git clone https://github.com/osandov/blktests.git
$ cd blktests
$ nvme_trtype=tcp ./check nvme

2022-08-18 15:51:20

by Chaitanya Kulkarni

[permalink] [raw]
Subject: Re: [PATCH v2] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM


> - iov_iter_kvec(&cmd->recv_msg.msg_iter, READ, cmd->iov,
> - cmd->nr_mapped, cmd->pdu_len);
> + iov_iter_kvec(&cmd->recv_msg.msg_iter, READ, cmd->iov, nr_mapped, cmd->pdu_len);
> }
>

overly long lines above ? we are keeping lines < 80 for consistency, can
be done at the time of applying patch.

-ck

2022-08-21 05:56:39

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

This introduced tons of pointless over 80 character lines, please stick
to a proper coding style.

Also or in-kernel I/O wouldn't it make more sense to use a BVEC iter
insted of a KVEC if we have the pages anyway? Or does networking not
support them properly?

2022-08-21 15:18:36

by Fabio M. De Francesco

[permalink] [raw]
Subject: Re: [PATCH v2] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

On giovedì 18 agosto 2022 13:45:49 CEST Sagi Grimberg wrote:
> Reviewed-by: Sagi Grimberg <[email protected]>
>
> Fabio, did you run blktests?
>
> The simplest thing you can do is:
>
> $ git clone https://github.com/osandov/blktests.git
> $ cd blktests
> $ nvme_trtype=tcp ./check nvme

I just run the tests you requested.
I'm going to send a v3 patch which shows the results.

Thanks,

Fabio



2022-08-21 15:20:39

by Fabio M. De Francesco

[permalink] [raw]
Subject: Re: [PATCH v2] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

On giovedì 18 agosto 2022 17:48:04 CEST Chaitanya Kulkarni wrote:
> > - iov_iter_kvec(&cmd->recv_msg.msg_iter, READ, cmd->iov,
> > - cmd->nr_mapped, cmd->pdu_len);
> > + iov_iter_kvec(&cmd->recv_msg.msg_iter, READ, cmd->iov, nr_mapped,
> > cmd->pdu_len);
>
> > }
> >
>
>
> overly long lines above ? we are keeping lines < 80 for consistency, can
> be done at the time of applying patch.
>
> -ck
>

I'm sorry for this.
I changed my post commit hook and forgot to make it executable :-(
I just wrote to Sagi and said that I'm going to send a new version with the
results of blktests. In the meantime I'll shorten those long lines below the
80 characters threshold.

Thanks,

Fabio


2022-08-21 19:26:38

by Fabio M. De Francesco

[permalink] [raw]
Subject: Re: [PATCH v2] nvmet-tcp: Don't kmap() pages which can't come from HIGHMEM

On domenica 21 agosto 2022 07:46:02 CEST Christoph Hellwig wrote:
> This introduced tons of pointless over 80 character lines, please stick
> to a proper coding style.

I'm sorry for this mistake. I'll break those long lines in the next version.

> Also or in-kernel I/O wouldn't it make more sense to use a BVEC iter
> insted of a KVEC if we have the pages anyway? Or does networking not
> support them properly?

I must admit I knew practically nothing about this code until I met it while
doing my long journey towards kmap() and kmap_atomic() removals. I can say the
same for regard to the differences between BVEC iter and KVEC iter.

Since you are asking this, today I did some research and read code from other
drivers/subsystems, despite this particular issue is out of the scope of my
patch.

I also read an interesting article in LWN: "The iov_iter interface" at
https://lwn.net/Articles/625077/

I may very well be wrong, however I think that iov_iter_kvec() is better
suited here.

@Sagi, anything to add to this discussion?

Thanks,

Fabio