2023-05-30 14:26:43

by David Howells

Subject: [PATCH net-next v2 00/10] crypto, splice, net: Make AF_ALG handle sendmsg(MSG_SPLICE_PAGES)

Here's the fourth tranche of patches towards providing a MSG_SPLICE_PAGES
internal sendmsg flag that is intended to replace the ->sendpage() op with
calls to sendmsg(). MSG_SPLICE_PAGES is a hint that tells the protocol
that it should splice the pages supplied if it can.

This set consists of the following parts:

(1) Move netfs_extract_iter_to_sg() to somewhere more general and rename
it to drop the "netfs" prefix. We use this to extract directly from
an iterator into a scatterlist.

(2) Make AF_ALG use iov_iter_extract_pages(). This has the additional
effect of pinning pages obtained from userspace rather than taking
refs on them. Pages from kernel-backed iterators would not be pinned,
but AF_ALG isn't really meant for use by kernel services.

(3) Change AF_ALG still further to use extract_iter_to_sg().

(4) Make af_alg_sendmsg() support MSG_SPLICE_PAGES and make
af_alg_sendpage() just a wrapper around sendmsg(). This has to take
refs on the pages pinned for the moment.

(5) Make hash_sendmsg() support MSG_SPLICE_PAGES by simply ignoring it.
hash_sendpage() is left untouched to be removed later, after the
splice core has been changed to call sendmsg().

I've pushed the patches here also:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=sendpage-4

David

ver #2)
- Put the "netfs_" prefix removal first to shorten lines and avoid
checkpatch 80-char warnings.
- Fix a couple of spelling mistakes.
- Wrap some lines at 80 chars.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6 # part 1
Link: https://lore.kernel.org/r/[email protected]/ # v1

David Howells (10):
Drop the netfs_ prefix from netfs_extract_iter_to_sg()
Fix a couple of spelling mistakes
Wrap lines at 80
Move netfs_extract_iter_to_sg() to lib/scatterlist.c
crypto: af_alg: Pin pages rather than ref'ing if appropriate
crypto: af_alg: Use extract_iter_to_sg() to create scatterlists
crypto: af_alg: Indent the loop in af_alg_sendmsg()
crypto: af_alg: Support MSG_SPLICE_PAGES
crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES
crypto: af_alg/hash: Support MSG_SPLICE_PAGES

crypto/af_alg.c | 185 ++++++++++++---------------
crypto/algif_aead.c | 38 +++---
crypto/algif_hash.c | 114 +++++++++++------
crypto/algif_skcipher.c | 10 +-
fs/cifs/smb2ops.c | 4 +-
fs/cifs/smbdirect.c | 2 +-
fs/netfs/iterator.c | 266 ---------------------------------------
include/crypto/if_alg.h | 7 +-
include/linux/netfs.h | 4 -
include/linux/uio.h | 5 +
lib/scatterlist.c | 269 ++++++++++++++++++++++++++++++++++++++++
11 files changed, 459 insertions(+), 445 deletions(-)



2023-05-30 14:26:43

by David Howells

Subject: [PATCH net-next v2 04/10] Move netfs_extract_iter_to_sg() to lib/scatterlist.c

Move netfs_extract_iter_to_sg() to lib/scatterlist.c as it's going to be
used by more than just network filesystems (AF_ALG, for example).
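
As a rough userspace illustration of what the user-buffer path of
extract_iter_to_sg() does with a byte range, the sketch below splits an
(offset, length) pair into per-page segments, one per scatterlist element.
Everything prefixed "sketch_" or "SKETCH_" is a hypothetical stand-in and not
kernel API, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/* Hypothetical stand-in for one scatterlist element within a single page. */
struct sketch_seg {
	size_t off;	/* offset into the page */
	size_t len;	/* number of bytes in this page */
};

/*
 * Split a range starting @off bytes into its first page and running for @len
 * bytes into per-page segments, writing at most @max of them to @segs.
 * Returns the number of segments written.
 */
static size_t sketch_split_range(size_t off, size_t len,
				 struct sketch_seg *segs, size_t max)
{
	size_t n = 0;

	assert(off < SKETCH_PAGE_SIZE);

	while (len > 0 && n < max) {
		size_t room = SKETCH_PAGE_SIZE - off;
		size_t seg = len < room ? len : room;

		segs[n].off = off;
		segs[n].len = seg;
		n++;
		len -= seg;
		off = 0;	/* only the first page has a non-zero offset */
	}
	return n;
}
```

The segment count for a fully extracted range matches
DIV_ROUND_UP(off + len, PAGE_SIZE), which is how the kernel code works out how
many scatterlist elements it has consumed.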

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/iterator.c | 267 -----------------------------------------
include/linux/netfs.h | 4 -
include/linux/uio.h | 5 +
lib/scatterlist.c | 269 ++++++++++++++++++++++++++++++++++++++++++
4 files changed, 274 insertions(+), 271 deletions(-)

diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
index 9f09dc30ceb6..2ff07ba655a0 100644
--- a/fs/netfs/iterator.c
+++ b/fs/netfs/iterator.c
@@ -101,270 +101,3 @@ ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
return npages;
}
EXPORT_SYMBOL_GPL(netfs_extract_user_iter);
-
-/*
- * Extract and pin a list of up to sg_max pages from UBUF- or IOVEC-class
- * iterators, and add them to the scatterlist.
- */
-static ssize_t extract_user_to_sg(struct iov_iter *iter,
- ssize_t maxsize,
- struct sg_table *sgtable,
- unsigned int sg_max,
- iov_iter_extraction_t extraction_flags)
-{
- struct scatterlist *sg = sgtable->sgl + sgtable->nents;
- struct page **pages;
- unsigned int npages;
- ssize_t ret = 0, res;
- size_t len, off;
-
- /* We decant the page list into the tail of the scatterlist */
- pages = (void *)sgtable->sgl +
- array_size(sg_max, sizeof(struct scatterlist));
- pages -= sg_max;
-
- do {
- res = iov_iter_extract_pages(iter, &pages, maxsize, sg_max,
- extraction_flags, &off);
- if (res < 0)
- goto failed;
-
- len = res;
- maxsize -= len;
- ret += len;
- npages = DIV_ROUND_UP(off + len, PAGE_SIZE);
- sg_max -= npages;
-
- for (; npages > 0; npages--) {
- struct page *page = *pages;
- size_t seg = min_t(size_t, PAGE_SIZE - off, len);
-
- *pages++ = NULL;
- sg_set_page(sg, page, seg, off);
- sgtable->nents++;
- sg++;
- len -= seg;
- off = 0;
- }
- } while (maxsize > 0 && sg_max > 0);
-
- return ret;
-
-failed:
- while (sgtable->nents > sgtable->orig_nents)
- put_page(sg_page(&sgtable->sgl[--sgtable->nents]));
- return res;
-}
-
-/*
- * Extract up to sg_max pages from a BVEC-type iterator and add them to the
- * scatterlist. The pages are not pinned.
- */
-static ssize_t extract_bvec_to_sg(struct iov_iter *iter,
- ssize_t maxsize,
- struct sg_table *sgtable,
- unsigned int sg_max,
- iov_iter_extraction_t extraction_flags)
-{
- const struct bio_vec *bv = iter->bvec;
- struct scatterlist *sg = sgtable->sgl + sgtable->nents;
- unsigned long start = iter->iov_offset;
- unsigned int i;
- ssize_t ret = 0;
-
- for (i = 0; i < iter->nr_segs; i++) {
- size_t off, len;
-
- len = bv[i].bv_len;
- if (start >= len) {
- start -= len;
- continue;
- }
-
- len = min_t(size_t, maxsize, len - start);
- off = bv[i].bv_offset + start;
-
- sg_set_page(sg, bv[i].bv_page, len, off);
- sgtable->nents++;
- sg++;
- sg_max--;
-
- ret += len;
- maxsize -= len;
- if (maxsize <= 0 || sg_max == 0)
- break;
- start = 0;
- }
-
- if (ret > 0)
- iov_iter_advance(iter, ret);
- return ret;
-}
-
-/*
- * Extract up to sg_max pages from a KVEC-type iterator and add them to the
- * scatterlist. This can deal with vmalloc'd buffers as well as kmalloc'd or
- * static buffers. The pages are not pinned.
- */
-static ssize_t extract_kvec_to_sg(struct iov_iter *iter,
- ssize_t maxsize,
- struct sg_table *sgtable,
- unsigned int sg_max,
- iov_iter_extraction_t extraction_flags)
-{
- const struct kvec *kv = iter->kvec;
- struct scatterlist *sg = sgtable->sgl + sgtable->nents;
- unsigned long start = iter->iov_offset;
- unsigned int i;
- ssize_t ret = 0;
-
- for (i = 0; i < iter->nr_segs; i++) {
- struct page *page;
- unsigned long kaddr;
- size_t off, len, seg;
-
- len = kv[i].iov_len;
- if (start >= len) {
- start -= len;
- continue;
- }
-
- kaddr = (unsigned long)kv[i].iov_base + start;
- off = kaddr & ~PAGE_MASK;
- len = min_t(size_t, maxsize, len - start);
- kaddr &= PAGE_MASK;
-
- maxsize -= len;
- ret += len;
- do {
- seg = min_t(size_t, len, PAGE_SIZE - off);
- if (is_vmalloc_or_module_addr((void *)kaddr))
- page = vmalloc_to_page((void *)kaddr);
- else
- page = virt_to_page(kaddr);
-
- sg_set_page(sg, page, len, off);
- sgtable->nents++;
- sg++;
- sg_max--;
-
- len -= seg;
- kaddr += PAGE_SIZE;
- off = 0;
- } while (len > 0 && sg_max > 0);
-
- if (maxsize <= 0 || sg_max == 0)
- break;
- start = 0;
- }
-
- if (ret > 0)
- iov_iter_advance(iter, ret);
- return ret;
-}
-
-/*
- * Extract up to sg_max folios from an XARRAY-type iterator and add them to
- * the scatterlist. The pages are not pinned.
- */
-static ssize_t extract_xarray_to_sg(struct iov_iter *iter,
- ssize_t maxsize,
- struct sg_table *sgtable,
- unsigned int sg_max,
- iov_iter_extraction_t extraction_flags)
-{
- struct scatterlist *sg = sgtable->sgl + sgtable->nents;
- struct xarray *xa = iter->xarray;
- struct folio *folio;
- loff_t start = iter->xarray_start + iter->iov_offset;
- pgoff_t index = start / PAGE_SIZE;
- ssize_t ret = 0;
- size_t offset, len;
- XA_STATE(xas, xa, index);
-
- rcu_read_lock();
-
- xas_for_each(&xas, folio, ULONG_MAX) {
- if (xas_retry(&xas, folio))
- continue;
- if (WARN_ON(xa_is_value(folio)))
- break;
- if (WARN_ON(folio_test_hugetlb(folio)))
- break;
-
- offset = offset_in_folio(folio, start);
- len = min_t(size_t, maxsize, folio_size(folio) - offset);
-
- sg_set_page(sg, folio_page(folio, 0), len, offset);
- sgtable->nents++;
- sg++;
- sg_max--;
-
- maxsize -= len;
- ret += len;
- if (maxsize <= 0 || sg_max == 0)
- break;
- }
-
- rcu_read_unlock();
- if (ret > 0)
- iov_iter_advance(iter, ret);
- return ret;
-}
-
-/**
- * extract_iter_to_sg - Extract pages from an iterator and add to an sglist
- * @iter: The iterator to extract from
- * @maxsize: The amount of iterator to copy
- * @sgtable: The scatterlist table to fill in
- * @sg_max: Maximum number of elements in @sgtable that may be filled
- * @extraction_flags: Flags to qualify the request
- *
- * Extract the page fragments from the given amount of the source iterator and
- * add them to a scatterlist that refers to all of those bits, to a maximum
- * addition of @sg_max elements.
- *
- * The pages referred to by UBUF- and IOVEC-type iterators are extracted and
- * pinned; BVEC-, KVEC- and XARRAY-type are extracted but aren't pinned; PIPE-
- * and DISCARD-type are not supported.
- *
- * No end mark is placed on the scatterlist; that's left to the caller.
- *
- * @extraction_flags can have ITER_ALLOW_P2PDMA set to request peer-to-peer DMA
- * be allowed on the pages extracted.
- *
- * If successful, @sgtable->nents is updated to include the number of elements
- * added and the number of bytes added is returned. @sgtable->orig_nents is
- * left unaltered.
- *
- * The iov_iter_extract_mode() function should be used to query how cleanup
- * should be performed.
- */
-ssize_t extract_iter_to_sg(struct iov_iter *iter, size_t maxsize,
- struct sg_table *sgtable, unsigned int sg_max,
- iov_iter_extraction_t extraction_flags)
-{
- if (maxsize == 0)
- return 0;
-
- switch (iov_iter_type(iter)) {
- case ITER_UBUF:
- case ITER_IOVEC:
- return extract_user_to_sg(iter, maxsize, sgtable, sg_max,
- extraction_flags);
- case ITER_BVEC:
- return extract_bvec_to_sg(iter, maxsize, sgtable, sg_max,
- extraction_flags);
- case ITER_KVEC:
- return extract_kvec_to_sg(iter, maxsize, sgtable, sg_max,
- extraction_flags);
- case ITER_XARRAY:
- return extract_xarray_to_sg(iter, maxsize, sgtable, sg_max,
- extraction_flags);
- default:
- pr_err("%s(%u) unsupported\n", __func__, iov_iter_type(iter));
- WARN_ON_ONCE(1);
- return -EIO;
- }
-}
-EXPORT_SYMBOL_GPL(extract_iter_to_sg);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 55e201c3a841..b11a84f6c32b 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -300,10 +300,6 @@ void netfs_stats_show(struct seq_file *);
ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
struct iov_iter *new,
iov_iter_extraction_t extraction_flags);
-struct sg_table;
-ssize_t extract_iter_to_sg(struct iov_iter *iter, size_t len,
- struct sg_table *sgtable, unsigned int sg_max,
- iov_iter_extraction_t extraction_flags);

/**
* netfs_inode - Get the netfs inode context from the inode
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 044c1d8c230c..0ccb983cf645 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -433,4 +433,9 @@ static inline bool iov_iter_extract_will_pin(const struct iov_iter *iter)
return user_backed_iter(iter);
}

+struct sg_table;
+ssize_t extract_iter_to_sg(struct iov_iter *iter, size_t len,
+ struct sg_table *sgtable, unsigned int sg_max,
+ iov_iter_extraction_t extraction_flags);
+
#endif
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 8d7519a8f308..e97d7060329e 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -9,6 +9,8 @@
#include <linux/scatterlist.h>
#include <linux/highmem.h>
#include <linux/kmemleak.h>
+#include <linux/bvec.h>
+#include <linux/uio.h>

/**
* sg_next - return the next scatterlist entry in a list
@@ -1095,3 +1097,270 @@ size_t sg_zero_buffer(struct scatterlist *sgl, unsigned int nents,
return offset;
}
EXPORT_SYMBOL(sg_zero_buffer);
+
+/*
+ * Extract and pin a list of up to sg_max pages from UBUF- or IOVEC-class
+ * iterators, and add them to the scatterlist.
+ */
+static ssize_t extract_user_to_sg(struct iov_iter *iter,
+ ssize_t maxsize,
+ struct sg_table *sgtable,
+ unsigned int sg_max,
+ iov_iter_extraction_t extraction_flags)
+{
+ struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+ struct page **pages;
+ unsigned int npages;
+ ssize_t ret = 0, res;
+ size_t len, off;
+
+ /* We decant the page list into the tail of the scatterlist */
+ pages = (void *)sgtable->sgl +
+ array_size(sg_max, sizeof(struct scatterlist));
+ pages -= sg_max;
+
+ do {
+ res = iov_iter_extract_pages(iter, &pages, maxsize, sg_max,
+ extraction_flags, &off);
+ if (res < 0)
+ goto failed;
+
+ len = res;
+ maxsize -= len;
+ ret += len;
+ npages = DIV_ROUND_UP(off + len, PAGE_SIZE);
+ sg_max -= npages;
+
+ for (; npages > 0; npages--) {
+ struct page *page = *pages;
+ size_t seg = min_t(size_t, PAGE_SIZE - off, len);
+
+ *pages++ = NULL;
+ sg_set_page(sg, page, seg, off);
+ sgtable->nents++;
+ sg++;
+ len -= seg;
+ off = 0;
+ }
+ } while (maxsize > 0 && sg_max > 0);
+
+ return ret;
+
+failed:
+ while (sgtable->nents > sgtable->orig_nents)
+ put_page(sg_page(&sgtable->sgl[--sgtable->nents]));
+ return res;
+}
+
+/*
+ * Extract up to sg_max pages from a BVEC-type iterator and add them to the
+ * scatterlist. The pages are not pinned.
+ */
+static ssize_t extract_bvec_to_sg(struct iov_iter *iter,
+ ssize_t maxsize,
+ struct sg_table *sgtable,
+ unsigned int sg_max,
+ iov_iter_extraction_t extraction_flags)
+{
+ const struct bio_vec *bv = iter->bvec;
+ struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+ unsigned long start = iter->iov_offset;
+ unsigned int i;
+ ssize_t ret = 0;
+
+ for (i = 0; i < iter->nr_segs; i++) {
+ size_t off, len;
+
+ len = bv[i].bv_len;
+ if (start >= len) {
+ start -= len;
+ continue;
+ }
+
+ len = min_t(size_t, maxsize, len - start);
+ off = bv[i].bv_offset + start;
+
+ sg_set_page(sg, bv[i].bv_page, len, off);
+ sgtable->nents++;
+ sg++;
+ sg_max--;
+
+ ret += len;
+ maxsize -= len;
+ if (maxsize <= 0 || sg_max == 0)
+ break;
+ start = 0;
+ }
+
+ if (ret > 0)
+ iov_iter_advance(iter, ret);
+ return ret;
+}
+
+/*
+ * Extract up to sg_max pages from a KVEC-type iterator and add them to the
+ * scatterlist. This can deal with vmalloc'd buffers as well as kmalloc'd or
+ * static buffers. The pages are not pinned.
+ */
+static ssize_t extract_kvec_to_sg(struct iov_iter *iter,
+ ssize_t maxsize,
+ struct sg_table *sgtable,
+ unsigned int sg_max,
+ iov_iter_extraction_t extraction_flags)
+{
+ const struct kvec *kv = iter->kvec;
+ struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+ unsigned long start = iter->iov_offset;
+ unsigned int i;
+ ssize_t ret = 0;
+
+ for (i = 0; i < iter->nr_segs; i++) {
+ struct page *page;
+ unsigned long kaddr;
+ size_t off, len, seg;
+
+ len = kv[i].iov_len;
+ if (start >= len) {
+ start -= len;
+ continue;
+ }
+
+ kaddr = (unsigned long)kv[i].iov_base + start;
+ off = kaddr & ~PAGE_MASK;
+ len = min_t(size_t, maxsize, len - start);
+ kaddr &= PAGE_MASK;
+
+ maxsize -= len;
+ ret += len;
+ do {
+ seg = min_t(size_t, len, PAGE_SIZE - off);
+ if (is_vmalloc_or_module_addr((void *)kaddr))
+ page = vmalloc_to_page((void *)kaddr);
+ else
+ page = virt_to_page(kaddr);
+
+ sg_set_page(sg, page, len, off);
+ sgtable->nents++;
+ sg++;
+ sg_max--;
+
+ len -= seg;
+ kaddr += PAGE_SIZE;
+ off = 0;
+ } while (len > 0 && sg_max > 0);
+
+ if (maxsize <= 0 || sg_max == 0)
+ break;
+ start = 0;
+ }
+
+ if (ret > 0)
+ iov_iter_advance(iter, ret);
+ return ret;
+}
+
+/*
+ * Extract up to sg_max folios from an XARRAY-type iterator and add them to
+ * the scatterlist. The pages are not pinned.
+ */
+static ssize_t extract_xarray_to_sg(struct iov_iter *iter,
+ ssize_t maxsize,
+ struct sg_table *sgtable,
+ unsigned int sg_max,
+ iov_iter_extraction_t extraction_flags)
+{
+ struct scatterlist *sg = sgtable->sgl + sgtable->nents;
+ struct xarray *xa = iter->xarray;
+ struct folio *folio;
+ loff_t start = iter->xarray_start + iter->iov_offset;
+ pgoff_t index = start / PAGE_SIZE;
+ ssize_t ret = 0;
+ size_t offset, len;
+ XA_STATE(xas, xa, index);
+
+ rcu_read_lock();
+
+ xas_for_each(&xas, folio, ULONG_MAX) {
+ if (xas_retry(&xas, folio))
+ continue;
+ if (WARN_ON(xa_is_value(folio)))
+ break;
+ if (WARN_ON(folio_test_hugetlb(folio)))
+ break;
+
+ offset = offset_in_folio(folio, start);
+ len = min_t(size_t, maxsize, folio_size(folio) - offset);
+
+ sg_set_page(sg, folio_page(folio, 0), len, offset);
+ sgtable->nents++;
+ sg++;
+ sg_max--;
+
+ maxsize -= len;
+ ret += len;
+ if (maxsize <= 0 || sg_max == 0)
+ break;
+ }
+
+ rcu_read_unlock();
+ if (ret > 0)
+ iov_iter_advance(iter, ret);
+ return ret;
+}
+
+/**
+ * extract_iter_to_sg - Extract pages from an iterator and add to an sglist
+ * @iter: The iterator to extract from
+ * @maxsize: The amount of iterator to copy
+ * @sgtable: The scatterlist table to fill in
+ * @sg_max: Maximum number of elements in @sgtable that may be filled
+ * @extraction_flags: Flags to qualify the request
+ *
+ * Extract the page fragments from the given amount of the source iterator and
+ * add them to a scatterlist that refers to all of those bits, to a maximum
+ * addition of @sg_max elements.
+ *
+ * The pages referred to by UBUF- and IOVEC-type iterators are extracted and
+ * pinned; BVEC-, KVEC- and XARRAY-type are extracted but aren't pinned; PIPE-
+ * and DISCARD-type are not supported.
+ *
+ * No end mark is placed on the scatterlist; that's left to the caller.
+ *
+ * @extraction_flags can have ITER_ALLOW_P2PDMA set to request peer-to-peer DMA
+ * be allowed on the pages extracted.
+ *
+ * If successful, @sgtable->nents is updated to include the number of elements
+ * added and the number of bytes added is returned. @sgtable->orig_nents is
+ * left unaltered.
+ *
+ * The iov_iter_extract_mode() function should be used to query how cleanup
+ * should be performed.
+ */
+ssize_t extract_iter_to_sg(struct iov_iter *iter, size_t maxsize,
+ struct sg_table *sgtable, unsigned int sg_max,
+ iov_iter_extraction_t extraction_flags)
+{
+ if (maxsize == 0)
+ return 0;
+
+ switch (iov_iter_type(iter)) {
+ case ITER_UBUF:
+ case ITER_IOVEC:
+ return extract_user_to_sg(iter, maxsize, sgtable, sg_max,
+ extraction_flags);
+ case ITER_BVEC:
+ return extract_bvec_to_sg(iter, maxsize, sgtable, sg_max,
+ extraction_flags);
+ case ITER_KVEC:
+ return extract_kvec_to_sg(iter, maxsize, sgtable, sg_max,
+ extraction_flags);
+ case ITER_XARRAY:
+ return extract_xarray_to_sg(iter, maxsize, sgtable, sg_max,
+ extraction_flags);
+ default:
+ pr_err("%s(%u) unsupported\n", __func__, iov_iter_type(iter));
+ WARN_ON_ONCE(1);
+ return -EIO;
+ }
+}
+EXPORT_SYMBOL_GPL(extract_iter_to_sg);


2023-05-30 14:26:48

by David Howells

Subject: [PATCH net-next v2 03/10] Wrap lines at 80

Wrap a line at 80 to stop checkpatch complaining.

Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: Simon Horman <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/iterator.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
index f41a37bca1e8..9f09dc30ceb6 100644
--- a/fs/netfs/iterator.c
+++ b/fs/netfs/iterator.c
@@ -119,7 +119,8 @@ static ssize_t extract_user_to_sg(struct iov_iter *iter,
size_t len, off;

/* We decant the page list into the tail of the scatterlist */
- pages = (void *)sgtable->sgl + array_size(sg_max, sizeof(struct scatterlist));
+ pages = (void *)sgtable->sgl +
+ array_size(sg_max, sizeof(struct scatterlist));
pages -= sg_max;

do {


2023-05-30 14:26:55

by David Howells

Subject: [PATCH net-next v2 05/10] crypto: af_alg: Pin pages rather than ref'ing if appropriate

Convert AF_ALG to use iov_iter_extract_pages() instead of
iov_iter_get_pages2(). Depending on the iterator type, this will either pin
the pages or leave them unaltered, rather than getting a ref on them.

The pages need to be pinned for DIO-read rather than having refs taken on
them to prevent VM copy-on-write from malfunctioning during a concurrent
fork() (the result of the I/O would otherwise end up only visible to the
child process and not the parent).
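
The cleanup rule this implies can be modelled in userspace roughly as follows:
whether pages will be pinned is recorded at extraction time, and the free path
only unpins when that was the case. The "sketch_" names are hypothetical
stand-ins for iov_iter_extract_will_pin(), af_alg_make_sg() and
af_alg_free_sg(), and no real pages are involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the iterator classes named in this series. */
enum sketch_iter_type {
	SKETCH_ITER_UBUF,
	SKETCH_ITER_IOVEC,
	SKETCH_ITER_BVEC,
	SKETCH_ITER_KVEC,
	SKETCH_ITER_XARRAY,
};

/* Models iov_iter_extract_will_pin(): only user-backed iterators pin. */
static bool sketch_will_pin(enum sketch_iter_type type)
{
	return type == SKETCH_ITER_UBUF || type == SKETCH_ITER_IOVEC;
}

struct sketch_sgl {
	size_t npages;
	bool need_unpin;	/* decided when the pages are extracted */
};

/* Models the extraction step: note the page count and the pinning mode. */
static void sketch_extract(struct sketch_sgl *sgl,
			   enum sketch_iter_type type, size_t npages)
{
	sgl->npages = npages;
	sgl->need_unpin = sketch_will_pin(type);
}

/* Models the free step: returns how many pages would have been unpinned. */
static size_t sketch_free(struct sketch_sgl *sgl)
{
	size_t unpinned = sgl->need_unpin ? sgl->npages : 0;

	sgl->npages = 0;
	return unpinned;
}
```

Pages from kernel-backed iterators are simply forgotten in this model, which
matches the "leave them unaltered" case above.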

Signed-off-by: David Howells <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
---
crypto/af_alg.c | 10 +++++++---
include/crypto/if_alg.h | 1 +
2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 5f7252a5b7b4..7caff10df643 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -533,14 +533,17 @@ static const struct net_proto_family alg_family = {

int af_alg_make_sg(struct af_alg_sgl *sgl, struct iov_iter *iter, int len)
{
+ struct page **pages = sgl->pages;
size_t off;
ssize_t n;
int npages, i;

- n = iov_iter_get_pages2(iter, sgl->pages, len, ALG_MAX_PAGES, &off);
+ n = iov_iter_extract_pages(iter, &pages, len, ALG_MAX_PAGES, 0, &off);
if (n < 0)
return n;

+ sgl->need_unpin = iov_iter_extract_will_pin(iter);
+
npages = DIV_ROUND_UP(off + n, PAGE_SIZE);
if (WARN_ON(npages == 0))
return -EINVAL;
@@ -573,8 +576,9 @@ void af_alg_free_sg(struct af_alg_sgl *sgl)
{
int i;

- for (i = 0; i < sgl->npages; i++)
- put_page(sgl->pages[i]);
+ if (sgl->need_unpin)
+ for (i = 0; i < sgl->npages; i++)
+ unpin_user_page(sgl->pages[i]);
}
EXPORT_SYMBOL_GPL(af_alg_free_sg);

diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 7e76623f9ec3..46494b33f5bc 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -59,6 +59,7 @@ struct af_alg_sgl {
struct scatterlist sg[ALG_MAX_PAGES + 1];
struct page *pages[ALG_MAX_PAGES];
unsigned int npages;
+ bool need_unpin;
};

/* TX SGL entry */


2023-05-30 14:27:00

by David Howells

Subject: [PATCH net-next v2 10/10] crypto: af_alg/hash: Support MSG_SPLICE_PAGES

Make AF_ALG sendmsg() support MSG_SPLICE_PAGES in the hashing code. This
causes pages to be spliced from the source iterator if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
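
For reviewers, the choice of hash operation made on each pass of the reworked
loop in hash_sendmsg() can be summarised by the userspace sketch below. It only
models the decision, as read from the diff in this patch; the "sketch_" names
are hypothetical, and the crypto_ahash_init() call that precedes the first
update or finup is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the ahash calls the loop can make. */
enum sketch_hash_op {
	SKETCH_OP_DIGEST,	/* models crypto_ahash_digest() */
	SKETCH_OP_UPDATE,	/* models crypto_ahash_update() */
	SKETCH_OP_FINUP,	/* models crypto_ahash_finup() */
};

/*
 * @data_left:  more of the message remains after this chunk
 * @continuing: an earlier chunk or sendmsg() already fed the hash
 * @more:       the caller passed MSG_MORE
 */
static enum sketch_hash_op sketch_pick_op(bool data_left, bool continuing,
					  bool more)
{
	/* A complete message in one chunk can use the one-shot digest. */
	if (!data_left && !continuing && !more)
		return SKETCH_OP_DIGEST;

	/* More data is coming, now or in a later sendmsg(). */
	if (data_left || more)
		return SKETCH_OP_UPDATE;

	/* Last chunk of a multi-part hash: update and finalise together. */
	return SKETCH_OP_FINUP;
}
```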

Signed-off-by: David Howells <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
---

Notes:
ver #2)
- Fixed some checkpatch warnings.

crypto/af_alg.c | 11 +++--
crypto/algif_hash.c | 104 ++++++++++++++++++++++++++++----------------
2 files changed, 74 insertions(+), 41 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index e2fc9051ba39..b78a399d0e19 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -542,9 +542,14 @@ void af_alg_free_sg(struct af_alg_sgl *sgl)
{
int i;

- if (sgl->need_unpin)
- for (i = 0; i < sgl->sgt.nents; i++)
- unpin_user_page(sg_page(&sgl->sgt.sgl[i]));
+ if (sgl->sgt.sgl) {
+ if (sgl->need_unpin)
+ for (i = 0; i < sgl->sgt.nents; i++)
+ unpin_user_page(sg_page(&sgl->sgt.sgl[i]));
+ if (sgl->sgt.sgl != sgl->sgl)
+ kvfree(sgl->sgt.sgl);
+ sgl->sgt.sgl = NULL;
+ }
}
EXPORT_SYMBOL_GPL(af_alg_free_sg);

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 16c69c4b9c62..2f7a98b0eae3 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -63,78 +63,106 @@ static void hash_free_result(struct sock *sk, struct hash_ctx *ctx)
static int hash_sendmsg(struct socket *sock, struct msghdr *msg,
size_t ignored)
{
- int limit = ALG_MAX_PAGES * PAGE_SIZE;
struct sock *sk = sock->sk;
struct alg_sock *ask = alg_sk(sk);
struct hash_ctx *ctx = ask->private;
- long copied = 0;
+ ssize_t copied = 0;
+ size_t len, max_pages = ALG_MAX_PAGES, npages;
+ bool continuing = ctx->more, need_init = false;
int err;

- if (limit > sk->sk_sndbuf)
- limit = sk->sk_sndbuf;
+ /* Don't limit to ALG_MAX_PAGES if the pages are all already pinned. */
+ if (!user_backed_iter(&msg->msg_iter))
+ max_pages = INT_MAX;
+ else
+ max_pages = min_t(size_t, max_pages,
+ DIV_ROUND_UP(sk->sk_sndbuf, PAGE_SIZE));

lock_sock(sk);
- if (!ctx->more) {
+ if (!continuing) {
if ((msg->msg_flags & MSG_MORE))
hash_free_result(sk, ctx);
-
- err = crypto_wait_req(crypto_ahash_init(&ctx->req), &ctx->wait);
- if (err)
- goto unlock;
+ need_init = true;
}

ctx->more = false;

while (msg_data_left(msg)) {
- int len = msg_data_left(msg);
-
- if (len > limit)
- len = limit;
-
ctx->sgl.sgt.sgl = ctx->sgl.sgl;
ctx->sgl.sgt.nents = 0;
ctx->sgl.sgt.orig_nents = 0;

- len = extract_iter_to_sg(&msg->msg_iter, len, &ctx->sgl.sgt,
- ALG_MAX_PAGES, 0);
- if (len < 0) {
- err = copied ? 0 : len;
- goto unlock;
+ err = -EIO;
+ npages = iov_iter_npages(&msg->msg_iter, max_pages);
+ if (npages == 0)
+ goto unlock_free;
+
+ if (npages > ARRAY_SIZE(ctx->sgl.sgl)) {
+ err = -ENOMEM;
+ ctx->sgl.sgt.sgl =
+ kvmalloc(array_size(npages,
+ sizeof(*ctx->sgl.sgt.sgl)),
+ GFP_KERNEL);
+ if (!ctx->sgl.sgt.sgl)
+ goto unlock_free;
}
- sg_mark_end(ctx->sgl.sgt.sgl + ctx->sgl.sgt.nents);
+ sg_init_table(ctx->sgl.sgl, npages);

ctx->sgl.need_unpin = iov_iter_extract_will_pin(&msg->msg_iter);

- ahash_request_set_crypt(&ctx->req, ctx->sgl.sgt.sgl, NULL, len);
+ err = extract_iter_to_sg(&msg->msg_iter, LONG_MAX,
+ &ctx->sgl.sgt, npages, 0);
+ if (err < 0)
+ goto unlock_free;
+ len = err;
+ sg_mark_end(ctx->sgl.sgt.sgl + ctx->sgl.sgt.nents - 1);

- err = crypto_wait_req(crypto_ahash_update(&ctx->req),
- &ctx->wait);
- af_alg_free_sg(&ctx->sgl);
- if (err) {
- iov_iter_revert(&msg->msg_iter, len);
- goto unlock;
+ if (!msg_data_left(msg)) {
+ err = hash_alloc_result(sk, ctx);
+ if (err)
+ goto unlock_free;
}

- copied += len;
- }
+ ahash_request_set_crypt(&ctx->req, ctx->sgl.sgt.sgl,
+ ctx->result, len);

- err = 0;
+ if (!msg_data_left(msg) && !continuing &&
+ !(msg->msg_flags & MSG_MORE)) {
+ err = crypto_ahash_digest(&ctx->req);
+ } else {
+ if (need_init) {
+ err = crypto_wait_req(
+ crypto_ahash_init(&ctx->req),
+ &ctx->wait);
+ if (err)
+ goto unlock_free;
+ need_init = false;
+ }
+
+ if (msg_data_left(msg) || (msg->msg_flags & MSG_MORE))
+ err = crypto_ahash_update(&ctx->req);
+ else
+ err = crypto_ahash_finup(&ctx->req);
+ continuing = true;
+ }

- ctx->more = msg->msg_flags & MSG_MORE;
- if (!ctx->more) {
- err = hash_alloc_result(sk, ctx);
+ err = crypto_wait_req(err, &ctx->wait);
if (err)
- goto unlock;
+ goto unlock_free;

- ahash_request_set_crypt(&ctx->req, NULL, ctx->result, 0);
- err = crypto_wait_req(crypto_ahash_final(&ctx->req),
- &ctx->wait);
+ copied += len;
+ af_alg_free_sg(&ctx->sgl);
}

+ ctx->more = msg->msg_flags & MSG_MORE;
+ err = 0;
unlock:
release_sock(sk);
+ return copied ?: err;

- return err ?: copied;
+unlock_free:
+ af_alg_free_sg(&ctx->sgl);
+ goto unlock;
}

static ssize_t hash_sendpage(struct socket *sock, struct page *page,


2023-05-30 14:27:02

by David Howells

Subject: [PATCH net-next v2 02/10] Fix a couple of spelling mistakes

Fix a couple of spelling mistakes in a comment.

Suggested-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]/
Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: David Howells <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Steve French <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Rohith Surabattula <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/netfs/iterator.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
index f8eba3de1a97..f41a37bca1e8 100644
--- a/fs/netfs/iterator.c
+++ b/fs/netfs/iterator.c
@@ -312,7 +312,7 @@ static ssize_t extract_xarray_to_sg(struct iov_iter *iter,
}

/**
- * extract_iter_to_sg - Extract pages from an iterator and add ot an sglist
+ * extract_iter_to_sg - Extract pages from an iterator and add to an sglist
* @iter: The iterator to extract from
* @maxsize: The amount of iterator to copy
* @sgtable: The scatterlist table to fill in
@@ -332,7 +332,7 @@ static ssize_t extract_xarray_to_sg(struct iov_iter *iter,
* @extraction_flags can have ITER_ALLOW_P2PDMA set to request peer-to-peer DMA
* be allowed on the pages extracted.
*
- * If successul, @sgtable->nents is updated to include the number of elements
+ * If successful, @sgtable->nents is updated to include the number of elements
* added and the number of bytes added is returned. @sgtable->orig_nents is
* left unaltered.
*


2023-05-30 14:27:02

by David Howells

Subject: [PATCH net-next v2 07/10] crypto: af_alg: Indent the loop in af_alg_sendmsg()

Put the loop in af_alg_sendmsg() into an if-statement so that it is indented.
This makes the next patch easier to review, as that patch adds another branch
to the if-statement to handle MSG_SPLICE_PAGES.

Signed-off-by: David Howells <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
---

Notes:
ver #2)
- Fix a checkpatch warning.

crypto/af_alg.c | 51 ++++++++++++++++++++++++++-----------------------
1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index b8bf6d8525ba..fd56ccff6fed 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1030,35 +1030,38 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
if (sgl->cur)
sg_unmark_end(sg + sgl->cur - 1);

- do {
- struct page *pg;
- unsigned int i = sgl->cur;
+ if (1 /* TODO check MSG_SPLICE_PAGES */) {
+ do {
+ struct page *pg;
+ unsigned int i = sgl->cur;

- plen = min_t(size_t, len, PAGE_SIZE);
+ plen = min_t(size_t, len, PAGE_SIZE);

- pg = alloc_page(GFP_KERNEL);
- if (!pg) {
- err = -ENOMEM;
- goto unlock;
- }
+ pg = alloc_page(GFP_KERNEL);
+ if (!pg) {
+ err = -ENOMEM;
+ goto unlock;
+ }

- sg_assign_page(sg + i, pg);
+ sg_assign_page(sg + i, pg);

- err = memcpy_from_msg(page_address(sg_page(sg + i)),
- msg, plen);
- if (err) {
- __free_page(sg_page(sg + i));
- sg_assign_page(sg + i, NULL);
- goto unlock;
- }
+ err = memcpy_from_msg(
+ page_address(sg_page(sg + i)),
+ msg, plen);
+ if (err) {
+ __free_page(sg_page(sg + i));
+ sg_assign_page(sg + i, NULL);
+ goto unlock;
+ }

- sg[i].length = plen;
- len -= plen;
- ctx->used += plen;
- copied += plen;
- size -= plen;
- sgl->cur++;
- } while (len && sgl->cur < MAX_SGL_ENTS);
+ sg[i].length = plen;
+ len -= plen;
+ ctx->used += plen;
+ copied += plen;
+ size -= plen;
+ sgl->cur++;
+ } while (len && sgl->cur < MAX_SGL_ENTS);
+ }

if (!size)
sg_mark_end(sg + sgl->cur - 1);


2023-05-30 14:27:08

by David Howells

Subject: [PATCH net-next v2 08/10] crypto: af_alg: Support MSG_SPLICE_PAGES

Make AF_ALG sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
spliced from the source iterator.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
---
crypto/af_alg.c | 28 ++++++++++++++++++++++++++--
crypto/algif_aead.c | 22 +++++++++++-----------
crypto/algif_skcipher.c | 8 ++++----
3 files changed, 41 insertions(+), 17 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index fd56ccff6fed..62f4205d42e3 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -940,6 +940,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
bool init = false;
int err = 0;

+ if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
+ !iov_iter_is_bvec(&msg->msg_iter))
+ return -EINVAL;
+
if (msg->msg_controllen) {
err = af_alg_cmsg_send(msg, &con);
if (err)
@@ -985,7 +989,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
while (size) {
struct scatterlist *sg;
size_t len = size;
- size_t plen;
+ ssize_t plen;

/* use the existing memory in an allocated page */
if (ctx->merge) {
@@ -1030,7 +1034,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
if (sgl->cur)
sg_unmark_end(sg + sgl->cur - 1);

- if (1 /* TODO check MSG_SPLICE_PAGES */) {
+ if (msg->msg_flags & MSG_SPLICE_PAGES) {
+ struct sg_table sgtable = {
+ .sgl = sg,
+ .nents = sgl->cur,
+ .orig_nents = sgl->cur,
+ };
+
+ plen = extract_iter_to_sg(&msg->msg_iter, len, &sgtable,
+ MAX_SGL_ENTS, 0);
+ if (plen < 0) {
+ err = plen;
+ goto unlock;
+ }
+
+ for (; sgl->cur < sgtable.nents; sgl->cur++)
+ get_page(sg_page(&sg[sgl->cur]));
+ len -= plen;
+ ctx->used += plen;
+ copied += plen;
+ size -= plen;
+ } else {
do {
struct page *pg;
unsigned int i = sgl->cur;
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 829878025dba..35bfa283748d 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,8 +9,8 @@
* The following concept of the memory management is used:
*
* The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
+ * filled by user space with the data submitted via sendpage. Filling up
+ * the TX SGL does not cause a crypto operation -- the data will only be
* tracked by the kernel. Upon receipt of one recvmsg call, the caller must
* provide a buffer which is tracked with the RX SGL.
*
@@ -113,19 +113,19 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
}

/*
- * Data length provided by caller via sendmsg/sendpage that has not
- * yet been processed.
+ * Data length provided by caller via sendmsg that has not yet been
+ * processed.
*/
used = ctx->used;

/*
- * Make sure sufficient data is present -- note, the same check is
- * also present in sendmsg/sendpage. The checks in sendpage/sendmsg
- * shall provide an information to the data sender that something is
- * wrong, but they are irrelevant to maintain the kernel integrity.
- * We need this check here too in case user space decides to not honor
- * the error message in sendmsg/sendpage and still call recvmsg. This
- * check here protects the kernel integrity.
+ * Make sure sufficient data is present -- note, the same check is also
+ * present in sendmsg. The checks in sendmsg shall provide an
+ * information to the data sender that something is wrong, but they are
+ * irrelevant to maintain the kernel integrity. We need this check
+ * here too in case user space decides to not honor the error message
+ * in sendmsg and still call recvmsg. This check here protects the
+ * kernel integrity.
*/
if (!aead_sufficient_data(sk))
return -EINVAL;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index a251cd6bd5b9..b1f321b9f846 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -9,10 +9,10 @@
* The following concept of the memory management is used:
*
* The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg. Filling up the TX
+ * SGL does not cause a crypto operation -- the data will only be tracked by
+ * the kernel. Upon receipt of one recvmsg call, the caller must provide a
+ * buffer which is tracked with the RX SGL.
*
* During the processing of the recvmsg operation, the cipher request is
* allocated and prepared. As part of the recvmsg operation, the processed


2023-05-30 14:27:20

by David Howells

Subject: [PATCH net-next v2 06/10] crypto: af_alg: Use extract_iter_to_sg() to create scatterlists

Use extract_iter_to_sg() to decant the destination iterator into a
scatterlist in af_alg_get_rsgl(). af_alg_make_sg() can then be removed.
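
As a rough illustration of what the removed helper did by hand (this is a
userspace sketch, not kernel code): af_alg_make_sg() took an extracted byte
range, given as an offset into the first page plus a length, and produced one
scatterlist entry per page. The names sketch_seg and sketch_split_pages and
the 4096-byte page size are assumptions made for the example, and the offset
is assumed to be smaller than one page.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, not taken from the patch */

/* Hypothetical stand-in for one scatterlist entry. */
struct sketch_seg {
	size_t offset;	/* offset of the data within its page */
	size_t length;	/* number of bytes used in that page */
};

/*
 * Split the byte range [off, off + n) into per-page segments.
 * Returns the number of segments written, or 0 if the range is empty
 * or does not fit into max_segs entries. Assumes off < SKETCH_PAGE_SIZE.
 */
static size_t sketch_split_pages(size_t off, size_t n,
				 struct sketch_seg *segs, size_t max_segs)
{
	/* Same rounding as DIV_ROUND_UP(off + n, PAGE_SIZE). */
	size_t npages = (off + n + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	size_t i;

	if (npages == 0 || npages > max_segs)
		return 0;

	for (i = 0; i < npages; i++) {
		size_t room = SKETCH_PAGE_SIZE - off;
		size_t plen = n < room ? n : room;

		segs[i].offset = off;
		segs[i].length = plen;

		off = 0;	/* only the first page starts mid-page */
		n -= plen;
	}
	return npages;
}
```

For a range starting 100 bytes into a page and covering 8192 bytes, this
yields three segments of 3996, 4096 and 100 bytes. extract_iter_to_sg() takes
over this bookkeeping, which is why the open-coded loop can go.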

Signed-off-by: David Howells <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
---

Notes:
ver #2)
- Fix some checkpatch warnings.

crypto/af_alg.c | 57 +++++++++++------------------------------
crypto/algif_aead.c | 16 +++++++-----
crypto/algif_hash.c | 18 +++++++++----
crypto/algif_skcipher.c | 2 +-
include/crypto/if_alg.h | 6 ++---
5 files changed, 40 insertions(+), 59 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 7caff10df643..b8bf6d8525ba 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -531,45 +531,11 @@ static const struct net_proto_family alg_family = {
.owner = THIS_MODULE,
};

-int af_alg_make_sg(struct af_alg_sgl *sgl, struct iov_iter *iter, int len)
-{
- struct page **pages = sgl->pages;
- size_t off;
- ssize_t n;
- int npages, i;
-
- n = iov_iter_extract_pages(iter, &pages, len, ALG_MAX_PAGES, 0, &off);
- if (n < 0)
- return n;
-
- sgl->need_unpin = iov_iter_extract_will_pin(iter);
-
- npages = DIV_ROUND_UP(off + n, PAGE_SIZE);
- if (WARN_ON(npages == 0))
- return -EINVAL;
- /* Add one extra for linking */
- sg_init_table(sgl->sg, npages + 1);
-
- for (i = 0, len = n; i < npages; i++) {
- int plen = min_t(int, len, PAGE_SIZE - off);
-
- sg_set_page(sgl->sg + i, sgl->pages[i], plen, off);
-
- off = 0;
- len -= plen;
- }
- sg_mark_end(sgl->sg + npages - 1);
- sgl->npages = npages;
-
- return n;
-}
-EXPORT_SYMBOL_GPL(af_alg_make_sg);
-
static void af_alg_link_sg(struct af_alg_sgl *sgl_prev,
struct af_alg_sgl *sgl_new)
{
- sg_unmark_end(sgl_prev->sg + sgl_prev->npages - 1);
- sg_chain(sgl_prev->sg, sgl_prev->npages + 1, sgl_new->sg);
+ sg_unmark_end(sgl_prev->sgt.sgl + sgl_prev->sgt.nents - 1);
+ sg_chain(sgl_prev->sgt.sgl, sgl_prev->sgt.nents + 1, sgl_new->sgt.sgl);
}

void af_alg_free_sg(struct af_alg_sgl *sgl)
@@ -577,8 +543,8 @@ void af_alg_free_sg(struct af_alg_sgl *sgl)
int i;

if (sgl->need_unpin)
- for (i = 0; i < sgl->npages; i++)
- unpin_user_page(sgl->pages[i]);
+ for (i = 0; i < sgl->sgt.nents; i++)
+ unpin_user_page(sg_page(&sgl->sgt.sgl[i]));
}
EXPORT_SYMBOL_GPL(af_alg_free_sg);

@@ -1292,8 +1258,8 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,

while (maxsize > len && msg_data_left(msg)) {
struct af_alg_rsgl *rsgl;
+ ssize_t err;
size_t seglen;
- int err;

/* limit the amount of readable buffers */
if (!af_alg_readable(sk))
@@ -1310,16 +1276,23 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
return -ENOMEM;
}

- rsgl->sgl.npages = 0;
+ rsgl->sgl.sgt.sgl = rsgl->sgl.sgl;
+ rsgl->sgl.sgt.nents = 0;
+ rsgl->sgl.sgt.orig_nents = 0;
list_add_tail(&rsgl->list, &areq->rsgl_list);

- /* make one iovec available as scatterlist */
- err = af_alg_make_sg(&rsgl->sgl, &msg->msg_iter, seglen);
+ sg_init_table(rsgl->sgl.sgt.sgl, ALG_MAX_PAGES);
+ err = extract_iter_to_sg(&msg->msg_iter, seglen, &rsgl->sgl.sgt,
+ ALG_MAX_PAGES, 0);
if (err < 0) {
rsgl->sg_num_bytes = 0;
return err;
}

+ sg_mark_end(rsgl->sgl.sgt.sgl + rsgl->sgl.sgt.nents - 1);
+ rsgl->sgl.need_unpin =
+ iov_iter_extract_will_pin(&msg->msg_iter);
+
/* chain the new scatterlist with previous one */
if (areq->last_rsgl)
af_alg_link_sg(&areq->last_rsgl->sgl, &rsgl->sgl);
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 42493b4d8ce4..829878025dba 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -210,7 +210,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
*/

/* Use the RX SGL as source (and destination) for crypto op. */
- rsgl_src = areq->first_rsgl.sgl.sg;
+ rsgl_src = areq->first_rsgl.sgl.sgt.sgl;

if (ctx->enc) {
/*
@@ -224,7 +224,8 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
* RX SGL: AAD || PT || Tag
*/
err = crypto_aead_copy_sgl(null_tfm, tsgl_src,
- areq->first_rsgl.sgl.sg, processed);
+ areq->first_rsgl.sgl.sgt.sgl,
+ processed);
if (err)
goto free;
af_alg_pull_tsgl(sk, processed, NULL, 0);
@@ -242,7 +243,8 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,

/* Copy AAD || CT to RX SGL buffer for in-place operation. */
err = crypto_aead_copy_sgl(null_tfm, tsgl_src,
- areq->first_rsgl.sgl.sg, outlen);
+ areq->first_rsgl.sgl.sgt.sgl,
+ outlen);
if (err)
goto free;

@@ -267,10 +269,10 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
if (usedpages) {
/* RX SGL present */
struct af_alg_sgl *sgl_prev = &areq->last_rsgl->sgl;
+ struct scatterlist *sg = sgl_prev->sgt.sgl;

- sg_unmark_end(sgl_prev->sg + sgl_prev->npages - 1);
- sg_chain(sgl_prev->sg, sgl_prev->npages + 1,
- areq->tsgl);
+ sg_unmark_end(sg + sgl_prev->sgt.nents - 1);
+ sg_chain(sg, sgl_prev->sgt.nents + 1, areq->tsgl);
} else
/* no RX SGL present (e.g. authentication only) */
rsgl_src = areq->tsgl;
@@ -278,7 +280,7 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,

/* Initialize the crypto operation */
aead_request_set_crypt(&areq->cra_u.aead_req, rsgl_src,
- areq->first_rsgl.sgl.sg, used, ctx->iv);
+ areq->first_rsgl.sgl.sgt.sgl, used, ctx->iv);
aead_request_set_ad(&areq->cra_u.aead_req, ctx->aead_assoclen);
aead_request_set_tfm(&areq->cra_u.aead_req, tfm);

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 63af72e19fa8..16c69c4b9c62 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -91,13 +91,21 @@ static int hash_sendmsg(struct socket *sock, struct msghdr *msg,
if (len > limit)
len = limit;

- len = af_alg_make_sg(&ctx->sgl, &msg->msg_iter, len);
+ ctx->sgl.sgt.sgl = ctx->sgl.sgl;
+ ctx->sgl.sgt.nents = 0;
+ ctx->sgl.sgt.orig_nents = 0;
+
+ len = extract_iter_to_sg(&msg->msg_iter, len, &ctx->sgl.sgt,
+ ALG_MAX_PAGES, 0);
if (len < 0) {
err = copied ? 0 : len;
goto unlock;
}
+ sg_mark_end(ctx->sgl.sgt.sgl + ctx->sgl.sgt.nents);
+
+ ctx->sgl.need_unpin = iov_iter_extract_will_pin(&msg->msg_iter);

- ahash_request_set_crypt(&ctx->req, ctx->sgl.sg, NULL, len);
+ ahash_request_set_crypt(&ctx->req, ctx->sgl.sgt.sgl, NULL, len);

err = crypto_wait_req(crypto_ahash_update(&ctx->req),
&ctx->wait);
@@ -141,8 +149,8 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
flags |= MSG_MORE;

lock_sock(sk);
- sg_init_table(ctx->sgl.sg, 1);
- sg_set_page(ctx->sgl.sg, page, size, offset);
+ sg_init_table(ctx->sgl.sgl, 1);
+ sg_set_page(ctx->sgl.sgl, page, size, offset);

if (!(flags & MSG_MORE)) {
err = hash_alloc_result(sk, ctx);
@@ -151,7 +159,7 @@ static ssize_t hash_sendpage(struct socket *sock, struct page *page,
} else if (!ctx->more)
hash_free_result(sk, ctx);

- ahash_request_set_crypt(&ctx->req, ctx->sgl.sg, ctx->result, size);
+ ahash_request_set_crypt(&ctx->req, ctx->sgl.sgl, ctx->result, size);

if (!(flags & MSG_MORE)) {
if (ctx->more)
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index ee8890ee8f33..a251cd6bd5b9 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -105,7 +105,7 @@ static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
/* Initialize the crypto operation */
skcipher_request_set_tfm(&areq->cra_u.skcipher_req, tfm);
skcipher_request_set_crypt(&areq->cra_u.skcipher_req, areq->tsgl,
- areq->first_rsgl.sgl.sg, len, ctx->iv);
+ areq->first_rsgl.sgl.sgt.sgl, len, ctx->iv);

if (msg->msg_iocb && !is_sync_kiocb(msg->msg_iocb)) {
/* AIO operation */
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 46494b33f5bc..34224e77f5a2 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -56,9 +56,8 @@ struct af_alg_type {
};

struct af_alg_sgl {
- struct scatterlist sg[ALG_MAX_PAGES + 1];
- struct page *pages[ALG_MAX_PAGES];
- unsigned int npages;
+ struct sg_table sgt;
+ struct scatterlist sgl[ALG_MAX_PAGES + 1];
bool need_unpin;
};

@@ -164,7 +163,6 @@ int af_alg_release(struct socket *sock);
void af_alg_release_parent(struct sock *sk);
int af_alg_accept(struct sock *sk, struct socket *newsock, bool kern);

-int af_alg_make_sg(struct af_alg_sgl *sgl, struct iov_iter *iter, int len);
void af_alg_free_sg(struct af_alg_sgl *sgl);

static inline struct alg_sock *alg_sk(struct sock *sk)


2023-05-30 14:27:27

by David Howells

Subject: [PATCH net-next v2 09/10] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES

Convert af_alg_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather
than directly splicing in the pages itself.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <[email protected]>
cc: Herbert Xu <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
---
crypto/af_alg.c | 52 ++++++++-----------------------------------------
1 file changed, 8 insertions(+), 44 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 62f4205d42e3..e2fc9051ba39 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1118,53 +1118,17 @@ EXPORT_SYMBOL_GPL(af_alg_sendmsg);
ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
int offset, size_t size, int flags)
{
- struct sock *sk = sock->sk;
- struct alg_sock *ask = alg_sk(sk);
- struct af_alg_ctx *ctx = ask->private;
- struct af_alg_tsgl *sgl;
- int err = -EINVAL;
+ struct bio_vec bvec;
+ struct msghdr msg = {
+ .msg_flags = flags | MSG_SPLICE_PAGES,
+ };

if (flags & MSG_SENDPAGE_NOTLAST)
- flags |= MSG_MORE;
-
- lock_sock(sk);
- if (!ctx->more && ctx->used)
- goto unlock;
-
- if (!size)
- goto done;
-
- if (!af_alg_writable(sk)) {
- err = af_alg_wait_for_wmem(sk, flags);
- if (err)
- goto unlock;
- }
-
- err = af_alg_alloc_tsgl(sk);
- if (err)
- goto unlock;
-
- ctx->merge = 0;
- sgl = list_entry(ctx->tsgl_list.prev, struct af_alg_tsgl, list);
-
- if (sgl->cur)
- sg_unmark_end(sgl->sg + sgl->cur - 1);
-
- sg_mark_end(sgl->sg + sgl->cur);
-
- get_page(page);
- sg_set_page(sgl->sg + sgl->cur, page, size, offset);
- sgl->cur++;
- ctx->used += size;
-
-done:
- ctx->more = flags & MSG_MORE;
-
-unlock:
- af_alg_data_wakeup(sk);
- release_sock(sk);
+ msg.msg_flags |= MSG_MORE;

- return err ?: size;
+ bvec_set_page(&bvec, page, size, offset);
+ iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+ return sock_sendmsg(sock, &msg);
}
EXPORT_SYMBOL_GPL(af_alg_sendpage);



2023-05-30 19:56:56

by Simon Horman

Subject: Re: [PATCH net-next v2 02/10] Fix a couple of spelling mistakes

On Tue, May 30, 2023 at 03:16:26PM +0100, David Howells wrote:
> Fix a couple of spelling mistakes in a comment.
>
> Suggested-by: Simon Horman <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]/
> Link: https://lore.kernel.org/r/[email protected]/
> Signed-off-by: David Howells <[email protected]>

Reviewed-by: Simon Horman <[email protected]>


2023-06-01 09:56:41

by Paolo Abeni

Subject: Re: [PATCH net-next v2 08/10] crypto: af_alg: Support MSG_SPLICE_PAGES

On Tue, 2023-05-30 at 15:16 +0100, David Howells wrote:
> Make AF_ALG sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
> spliced from the source iterator.
>
> This allows ->sendpage() to be replaced by something that can handle
> multiple multipage folios in a single transaction.
>
> Signed-off-by: David Howells <[email protected]>
> cc: Herbert Xu <[email protected]>
> cc: "David S. Miller" <[email protected]>
> cc: Eric Dumazet <[email protected]>
> cc: Jakub Kicinski <[email protected]>
> cc: Paolo Abeni <[email protected]>
> cc: Jens Axboe <[email protected]>
> cc: Matthew Wilcox <[email protected]>
> cc: [email protected]
> cc: [email protected]
> ---
> crypto/af_alg.c | 28 ++++++++++++++++++++++++++--
> crypto/algif_aead.c | 22 +++++++++++-----------
> crypto/algif_skcipher.c | 8 ++++----
> 3 files changed, 41 insertions(+), 17 deletions(-)
>
> diff --git a/crypto/af_alg.c b/crypto/af_alg.c
> index fd56ccff6fed..62f4205d42e3 100644
> --- a/crypto/af_alg.c
> +++ b/crypto/af_alg.c
> @@ -940,6 +940,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
> bool init = false;
> int err = 0;
>
> + if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> + !iov_iter_is_bvec(&msg->msg_iter))
> + return -EINVAL;
> +
> if (msg->msg_controllen) {
> err = af_alg_cmsg_send(msg, &con);
> if (err)
> @@ -985,7 +989,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
> while (size) {
> struct scatterlist *sg;
> size_t len = size;
> - size_t plen;
> + ssize_t plen;
>
> /* use the existing memory in an allocated page */
> if (ctx->merge) {
> @@ -1030,7 +1034,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
> if (sgl->cur)
> sg_unmark_end(sg + sgl->cur - 1);
>
> - if (1 /* TODO check MSG_SPLICE_PAGES */) {
> + if (msg->msg_flags & MSG_SPLICE_PAGES) {
> + struct sg_table sgtable = {
> + .sgl = sg,
> + .nents = sgl->cur,
> + .orig_nents = sgl->cur,
> + };
> +
> + plen = extract_iter_to_sg(&msg->msg_iter, len, &sgtable,
> + MAX_SGL_ENTS, 0);

It looks like the above expects/supports only ITER_BVEC iterators. What
about adding a WARN_ON_ONCE(<other iov type>)?

Also, I'm keeping this series a bit more in pw to allow Herbert or
others to have a look.

Cheers,

Paolo


2023-06-06 08:41:29

by Paolo Abeni

Subject: Re: [PATCH net-next v2 08/10] crypto: af_alg: Support MSG_SPLICE_PAGES

On Thu, 2023-06-01 at 12:35 +0100, David Howells wrote:
> Paolo Abeni <[email protected]> wrote:
>
> > > + if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
> > > + !iov_iter_is_bvec(&msg->msg_iter))
> > > + return -EINVAL;
> > > +
> > ...
> > It looks like the above expect/supports only ITER_BVEC iterators, what
> > about adding a WARN_ON_ONCE(<other iov type>)?
>
> Meh. I relaxed that requirement as I'm now using tools to extract stuff from
> any iterator (extract_iter_to_sg() in this case) rather than walking the
> bvec[] directly. I forgot to remove the check from af_alg. I can add an
> extra patch to remove it. Also, it probably doesn't matter for AF_ALG since
> that's only likely to be called from userspace, either directly (which will
> not set MSG_SPLICE_PAGES) or via splice (which will pass a BVEC). Internal
> kernel code will use crypto API directly.

Thank you for the clarification, I got lost a bit. The patch LGTM as
is.
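
For readers following the exchange, the guard under discussion can be
modelled in userspace roughly as below. This is only a sketch: the flag value
and the enum are hypothetical stand-ins (prefixed SKETCH_), not the kernel's
definitions, and the function name is made up for the example.

```c
#include <errno.h>

/* Hypothetical flag bit, NOT the kernel's MSG_SPLICE_PAGES value. */
#define SKETCH_MSG_SPLICE_PAGES 0x1u

/* Hypothetical stand-in for the iterator types named in the thread. */
enum sketch_iter_type {
	SKETCH_ITER_UBUF,
	SKETCH_ITER_IOVEC,
	SKETCH_ITER_BVEC,
	SKETCH_ITER_KVEC,
};

/*
 * Model of the check at the top of af_alg_sendmsg() in this patch:
 * a splice request is rejected unless the iterator is a BVEC.
 * Requests without the splice flag are not affected by the check.
 */
static int sketch_check_splice(unsigned int flags, enum sketch_iter_type type)
{
	if ((flags & SKETCH_MSG_SPLICE_PAGES) && type != SKETCH_ITER_BVEC)
		return -EINVAL;
	return 0;
}
```

With this model, a plain sendmsg() from userspace (no splice flag) passes
whatever the iterator type, and splice (flag set, BVEC) passes too. That
matches the observation quoted above that the check rarely matters for
AF_ALG in practice.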

>
> > Also, I'm keeping this series a bit more in pw to allow Herbert or
> > others to have a look.

@Herbert, the series LGTM, I think we should apply it. If you have any
concerns, please voice them soon!

Thanks,

Paolo


2023-06-06 09:30:58

by David Howells

Subject: Re: [PATCH net-next v2 10/10] crypto: af_alg/hash: Support MSG_SPLICE_PAGES

Herbert Xu <[email protected]> wrote:

> > - if (limit > sk->sk_sndbuf)
> > - limit = sk->sk_sndbuf;
> > + /* Don't limit to ALG_MAX_PAGES if the pages are all already pinned. */
> > + if (!user_backed_iter(&msg->msg_iter))
> > + max_pages = INT_MAX;

If the iov_iter is a kernel-backed type (BVEC, KVEC, XARRAY) then (a) all the
pages it refers to must already be pinned in memory and (b) the caller must
have limited it in some way (splice is limited by the pipe capacity, for
instance). In which case, it seems pointless taking more than one pass of the
while loop if we can avoid it - at least from the point of view of memory
handling; granted there might be other criteria such as hogging crypto offload
hardware.

> > + else
> > + max_pages = min_t(size_t, max_pages,
> > + DIV_ROUND_UP(sk->sk_sndbuf, PAGE_SIZE));
>
> What's the purpose of relaxing this limit?

If the iov_iter is a user-backed type (IOVEC or UBUF) then it's not relaxed.
max_pages is ALG_MAX_PAGES here (actually, I should just move that here so
that it's clearer).

I am, however, applying the sk_sndbuf limit here also - there's no point
extracting more pages than we need to if ALG_MAX_PAGES of whole pages would
overrun the byte limit.
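
To make the arithmetic concrete, here is a small userspace sketch of the
limit selection in the quoted hunk. It is an illustration under assumptions,
not the patch itself: the 4096-byte page size, the value 16 for the page cap,
and the sketch_ names are all chosen for the example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed page size */
#define SKETCH_ALG_MAX_PAGES	16u	/* assumed value of the page cap */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Pick the per-pass page limit as proposed in the quoted hunk:
 *  - kernel-backed iterators: effectively unlimited (INT_MAX);
 *  - user-backed iterators: the smaller of the page cap and the
 *    number of pages needed to hold sndbuf bytes.
 */
static size_t sketch_max_pages(bool user_backed, size_t sndbuf)
{
	size_t by_bytes;

	if (!user_backed)
		return INT_MAX;

	by_bytes = SKETCH_DIV_ROUND_UP(sndbuf, SKETCH_PAGE_SIZE);
	return by_bytes < SKETCH_ALG_MAX_PAGES ? by_bytes : SKETCH_ALG_MAX_PAGES;
}
```

With these assumed values, a send buffer of 8192 bytes caps a user-backed
request at 2 pages, while a send buffer of 212992 bytes (52 pages) is still
capped at 16 by the page limit.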

> Even if there is a reason for this shouldn't this be in a patch by itself?

I suppose I could do it as a follow-on patch; until then, apply ALG_MAX_PAGES
and sk_sndbuf to all iterators, as is done for user-backed ones.

Actually, is it worth paying attention to sk_sndbuf for kernel-backed
iterators?

David


2023-06-06 09:43:35

by Herbert Xu

Subject: Re: [PATCH net-next v2 10/10] crypto: af_alg/hash: Support MSG_SPLICE_PAGES

On Tue, Jun 06, 2023 at 10:24:55AM +0100, David Howells wrote:
>
> If the iov_iter is a user-backed type (IOVEC or UBUF) then it's not relaxed.
> max_pages is ALG_MAX_PAGES here (actually, I should just move that here so
> that it's clearer).

Even if it's kernel memory, the pages can't be freed during the hashing
operation, which could be long if the amount is large (or the algo
is slow).

The reason for the limit here is to stop a malicious user from
pinning an unlimited amount of memory by doing a hashing operation,
IOW a DoS attack.

So I think we should keep the limit as is.
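
A back-of-the-envelope userspace sketch of why the cap bounds the exposure:
if each pass of the loop holds at most max_pages pages, the amount held at
any one time is bounded no matter how much data is submitted. The structure
and function names, the 4096-byte page size and the simplification that data
is consumed in whole chunks are all assumptions made for the example.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical result record for the simulation. */
struct sketch_stats {
	size_t passes;		/* number of loop passes needed */
	size_t peak_held;	/* most bytes held during any single pass */
};

/*
 * Simulate hashing total_bytes when each pass may hold at most
 * max_pages pages. Returns zeroed stats if max_pages is 0.
 */
static struct sketch_stats sketch_hash_in_chunks(size_t total_bytes,
						 size_t max_pages)
{
	struct sketch_stats st = { 0, 0 };
	size_t chunk_limit = max_pages * SKETCH_PAGE_SIZE;

	if (chunk_limit == 0)
		return st;

	while (total_bytes > 0) {
		size_t chunk = total_bytes < chunk_limit ?
			       total_bytes : chunk_limit;

		if (chunk > st.peak_held)
			st.peak_held = chunk;
		total_bytes -= chunk;
		st.passes++;
	}
	return st;
}
```

Under these assumptions, 1 MiB of input with a 16-page cap takes 16 passes
and never holds more than 64 KiB at once. With the cap raised to 256 pages
the same input is handled in one pass, but the full 1 MiB is held for the
duration of the hash.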

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2023-06-06 10:23:14

by David Howells

Subject: Re: [PATCH net-next v2 10/10] crypto: af_alg/hash: Support MSG_SPLICE_PAGES

Herbert Xu <[email protected]> wrote:

> So I think we should keep the limit as is.

Okay.

David