2023-02-13 15:34:12

by David Howells

[permalink] [raw]
Subject: [PATCH v2 0/4] iov_iter: Adjust styling/location of new splice functions

Hi Jens, Al, Christoph,

Here are patches to make some changes that Christoph requested[1] to the
new generic file splice functions that I implemented[2]. Apart from one
functional change, they just altering the styling and move one of the
functions to a different file:

(1) Rename the main functions:

generic_file_buffered_splice_read() -> filemap_splice_read()
generic_file_direct_splice_read() -> direct_splice_read()

(2) Abstract out the calculation of the location of the head pipe buffer
into a helper function in linux/pipe_fs_i.h.

(3) Use init_sync_kiocb() in filemap_splice_read().

This is where the functional change is. Some kiocb fields are then
filled in where they were set to 0 before, including setting ki_flags
from f_iocb_flags. I've filtered out IOCB_NOWAIT as the function is
supposed to be synchronous.

(4) Move filemap_splice_read() to mm/filemap.c. filemap_get_pages() can
then be made static again.

I've pushed the patches here also:

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-extract-3

I've also updated worked the changes into the commits on my iov-extract
branch if that would be preferable, though that means Jens would need to
update his for-6.3/iov-extract again.

David

Link: https://lore.kernel.org/r/Y+n0n2UE8BQa/[email protected]/ [1]
Link: https://lore.kernel.org/r/[email protected]/ [2]

Changes
=======
ver #2)
- Don't attempt to filter IOCB_* flags in filemap_splice_read().

Link: https://lore.kernel.org/r/[email protected]/ # v1

David Howells (4):
splice: Rename new splice functions
splice: Provide pipe_head_buf() helper
splice: Use init_sync_kiocb() in filemap_splice_read()
splice: Move filemap_read_splice() to mm/filemap.c

fs/splice.c | 146 ++------------------------------------
include/linux/pagemap.h | 2 -
include/linux/pipe_fs_i.h | 20 ++++++
include/linux/splice.h | 4 ++
mm/filemap.c | 137 +++++++++++++++++++++++++++++++++--
5 files changed, 162 insertions(+), 147 deletions(-)



2023-02-13 15:34:26

by David Howells

[permalink] [raw]
Subject: [PATCH v2 2/4] splice: Provide pipe_head_buf() helper

Provide a helper, pipe_head_buf(), to get the current head buffer from a
pipe. Implement this as a wrapper around a more general function,
pipe_buf(), that gets a specified buffer.

Requested-by: Christoph Hellwig <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Al Viro <[email protected]>
cc: John Hubbard <[email protected]>
cc: David Hildenbrand <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/splice.c | 9 +++------
include/linux/pipe_fs_i.h | 20 ++++++++++++++++++++
2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 91b9e2cb9e03..7c0ff187f87a 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -295,7 +295,6 @@ static ssize_t direct_splice_read(struct file *in, loff_t *ppos,
struct bio_vec *bv;
struct kiocb kiocb;
struct page **pages;
- unsigned int head;
ssize_t ret;
size_t used, npages, chunk, remain, reclaim;
int i;
@@ -358,9 +357,8 @@ static ssize_t direct_splice_read(struct file *in, loff_t *ppos,
}

/* Push the remaining pages into the pipe. */
- head = pipe->head;
for (i = 0; i < npages; i++) {
- struct pipe_buffer *buf = &pipe->bufs[head & (pipe->ring_size - 1)];
+ struct pipe_buffer *buf = pipe_head_buf(pipe);

chunk = min_t(size_t, remain, PAGE_SIZE);
*buf = (struct pipe_buffer) {
@@ -369,10 +367,9 @@ static ssize_t direct_splice_read(struct file *in, loff_t *ppos,
.offset = 0,
.len = chunk,
};
- head++;
+ pipe->head++;
remain -= chunk;
}
- pipe->head = head;

kfree(bv);
return ret;
@@ -394,7 +391,7 @@ static size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,

while (spliced < size &&
!pipe_full(pipe->head, pipe->tail, pipe->max_usage)) {
- struct pipe_buffer *buf = &pipe->bufs[pipe->head & (pipe->ring_size - 1)];
+ struct pipe_buffer *buf = pipe_head_buf(pipe);
size_t part = min_t(size_t, PAGE_SIZE - offset, size - spliced);

*buf = (struct pipe_buffer) {
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index 6cb65df3e3ba..d2c3f16cf6b1 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -156,6 +156,26 @@ static inline bool pipe_full(unsigned int head, unsigned int tail,
return pipe_occupancy(head, tail) >= limit;
}

+/**
+ * pipe_buf - Return the pipe buffer for the specified slot in the pipe ring
+ * @pipe: The pipe to access
+ * @slot: The slot of interest
+ */
+static inline struct pipe_buffer *pipe_buf(const struct pipe_inode_info *pipe,
+ unsigned int slot)
+{
+ return &pipe->bufs[slot & (pipe->ring_size - 1)];
+}
+
+/**
+ * pipe_head_buf - Return the pipe buffer at the head of the pipe ring
+ * @pipe: The pipe to access
+ */
+static inline struct pipe_buffer *pipe_head_buf(const struct pipe_inode_info *pipe)
+{
+ return pipe_buf(pipe, pipe->head);
+}
+
/**
* pipe_buf_get - get a reference to a pipe_buffer
* @pipe: the pipe that the buffer belongs to


2023-02-13 15:34:28

by David Howells

[permalink] [raw]
Subject: [PATCH v2 3/4] splice: Use init_sync_kiocb() in filemap_splice_read()

Use init_sync_kiocb() in filemap_splice_read() rather than open coding it.

Requested-by: Christoph Hellwig <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: Christoph Hellwig <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Al Viro <[email protected]>
cc: John Hubbard <[email protected]>
cc: David Hildenbrand <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---

Notes:
ver #2)
- Don't attempt to filter IOCB_* flags.

fs/splice.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 7c0ff187f87a..4ea63d6a9040 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -419,15 +419,14 @@ static ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
size_t len, unsigned int flags)
{
struct folio_batch fbatch;
+ struct kiocb iocb;
size_t total_spliced = 0, used, npages;
loff_t isize, end_offset;
bool writably_mapped;
int i, error = 0;

- struct kiocb iocb = {
- .ki_filp = in,
- .ki_pos = *ppos,
- };
+ init_sync_kiocb(&iocb, in);
+ iocb.ki_pos = *ppos;

/* Work out how much data we can actually add into the pipe */
used = pipe_occupancy(pipe->head, pipe->tail);


2023-02-13 15:34:31

by David Howells

[permalink] [raw]
Subject: [PATCH v2 4/4] splice: Move filemap_read_splice() to mm/filemap.c

Move filemap_read_splice() to mm/filemap.c and make filemap_get_pages()
static again.

Requested-by: Christoph Hellwig <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
cc: Jens Axboe <[email protected]>
cc: Al Viro <[email protected]>
cc: John Hubbard <[email protected]>
cc: David Hildenbrand <[email protected]>
cc: Matthew Wilcox <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
---
fs/splice.c | 127 -------------------------------------
include/linux/pagemap.h | 2 -
include/linux/splice.h | 4 ++
mm/filemap.c | 137 ++++++++++++++++++++++++++++++++++++++--
4 files changed, 135 insertions(+), 135 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index 4ea63d6a9040..341cd8fb47a8 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -375,133 +375,6 @@ static ssize_t direct_splice_read(struct file *in, loff_t *ppos,
return ret;
}

-/*
- * Splice subpages from a folio into a pipe.
- */
-static size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
- struct folio *folio,
- loff_t fpos, size_t size)
-{
- struct page *page;
- size_t spliced = 0, offset = offset_in_folio(folio, fpos);
-
- page = folio_page(folio, offset / PAGE_SIZE);
- size = min(size, folio_size(folio) - offset);
- offset %= PAGE_SIZE;
-
- while (spliced < size &&
- !pipe_full(pipe->head, pipe->tail, pipe->max_usage)) {
- struct pipe_buffer *buf = pipe_head_buf(pipe);
- size_t part = min_t(size_t, PAGE_SIZE - offset, size - spliced);
-
- *buf = (struct pipe_buffer) {
- .ops = &page_cache_pipe_buf_ops,
- .page = page,
- .offset = offset,
- .len = part,
- };
- folio_get(folio);
- pipe->head++;
- page++;
- spliced += part;
- offset = 0;
- }
-
- return spliced;
-}
-
-/*
- * Splice folios from the pagecache of a buffered (ie. non-O_DIRECT) file into
- * a pipe.
- */
-static ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
- struct pipe_inode_info *pipe,
- size_t len, unsigned int flags)
-{
- struct folio_batch fbatch;
- struct kiocb iocb;
- size_t total_spliced = 0, used, npages;
- loff_t isize, end_offset;
- bool writably_mapped;
- int i, error = 0;
-
- init_sync_kiocb(&iocb, in);
- iocb.ki_pos = *ppos;
-
- /* Work out how much data we can actually add into the pipe */
- used = pipe_occupancy(pipe->head, pipe->tail);
- npages = max_t(ssize_t, pipe->max_usage - used, 0);
- len = min_t(size_t, len, npages * PAGE_SIZE);
-
- folio_batch_init(&fbatch);
-
- do {
- cond_resched();
-
- if (*ppos >= i_size_read(file_inode(in)))
- break;
-
- iocb.ki_pos = *ppos;
- error = filemap_get_pages(&iocb, len, &fbatch, true);
- if (error < 0)
- break;
-
- /*
- * i_size must be checked after we know the pages are Uptodate.
- *
- * Checking i_size after the check allows us to calculate
- * the correct value for "nr", which means the zero-filled
- * part of the page is not copied back to userspace (unless
- * another truncate extends the file - this is desired though).
- */
- isize = i_size_read(file_inode(in));
- if (unlikely(*ppos >= isize))
- break;
- end_offset = min_t(loff_t, isize, *ppos + len);
-
- /*
- * Once we start copying data, we don't want to be touching any
- * cachelines that might be contended:
- */
- writably_mapped = mapping_writably_mapped(in->f_mapping);
-
- for (i = 0; i < folio_batch_count(&fbatch); i++) {
- struct folio *folio = fbatch.folios[i];
- size_t n;
-
- if (folio_pos(folio) >= end_offset)
- goto out;
- folio_mark_accessed(folio);
-
- /*
- * If users can be writing to this folio using arbitrary
- * virtual addresses, take care of potential aliasing
- * before reading the folio on the kernel side.
- */
- if (writably_mapped)
- flush_dcache_folio(folio);
-
- n = splice_folio_into_pipe(pipe, folio, *ppos, len);
- if (!n)
- goto out;
- len -= n;
- total_spliced += n;
- *ppos += n;
- in->f_ra.prev_pos = *ppos;
- if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
- goto out;
- }
-
- folio_batch_release(&fbatch);
- } while (len);
-
-out:
- folio_batch_release(&fbatch);
- file_accessed(in);
-
- return total_spliced ? total_spliced : error;
-}
-
/**
* generic_file_splice_read - splice data from file to a pipe
* @in: file to splice from
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 3a7bdb35acff..29e1f9e76eb6 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -748,8 +748,6 @@ struct page *read_cache_page(struct address_space *, pgoff_t index,
filler_t *filler, struct file *file);
extern struct page * read_cache_page_gfp(struct address_space *mapping,
pgoff_t index, gfp_t gfp_mask);
-int filemap_get_pages(struct kiocb *iocb, size_t count,
- struct folio_batch *fbatch, bool need_uptodate);

static inline struct page *read_mapping_page(struct address_space *mapping,
pgoff_t index, struct file *file)
diff --git a/include/linux/splice.h b/include/linux/splice.h
index a55179fd60fc..691c44ef5c0b 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -67,6 +67,10 @@ typedef int (splice_actor)(struct pipe_inode_info *, struct pipe_buffer *,
typedef int (splice_direct_actor)(struct pipe_inode_info *,
struct splice_desc *);

+ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe,
+ size_t len, unsigned int flags);
+
extern ssize_t splice_from_pipe(struct pipe_inode_info *, struct file *,
loff_t *, size_t, unsigned int,
splice_actor *);
diff --git a/mm/filemap.c b/mm/filemap.c
index 6970be64a3e0..e1ee267675d2 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -42,6 +42,8 @@
#include <linux/ramfs.h>
#include <linux/page_idle.h>
#include <linux/migrate.h>
+#include <linux/pipe_fs_i.h>
+#include <linux/splice.h>
#include <asm/pgalloc.h>
#include <asm/tlbflush.h>
#include "internal.h"
@@ -2576,12 +2578,8 @@ static int filemap_readahead(struct kiocb *iocb, struct file *file,
return 0;
}

-/*
- * Extract some folios from the pagecache of a file, reading those pages from
- * the backing store if necessary and waiting for them.
- */
-int filemap_get_pages(struct kiocb *iocb, size_t count,
- struct folio_batch *fbatch, bool need_uptodate)
+static int filemap_get_pages(struct kiocb *iocb, size_t count,
+ struct folio_batch *fbatch, bool need_uptodate)
{
struct file *filp = iocb->ki_filp;
struct address_space *mapping = filp->f_mapping;
@@ -2845,6 +2843,133 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
}
EXPORT_SYMBOL(generic_file_read_iter);

+/*
+ * Splice subpages from a folio into a pipe.
+ */
+static size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
+ struct folio *folio,
+ loff_t fpos, size_t size)
+{
+ struct page *page;
+ size_t spliced = 0, offset = offset_in_folio(folio, fpos);
+
+ page = folio_page(folio, offset / PAGE_SIZE);
+ size = min(size, folio_size(folio) - offset);
+ offset %= PAGE_SIZE;
+
+ while (spliced < size &&
+ !pipe_full(pipe->head, pipe->tail, pipe->max_usage)) {
+ struct pipe_buffer *buf = pipe_head_buf(pipe);
+ size_t part = min_t(size_t, PAGE_SIZE - offset, size - spliced);
+
+ *buf = (struct pipe_buffer) {
+ .ops = &page_cache_pipe_buf_ops,
+ .page = page,
+ .offset = offset,
+ .len = part,
+ };
+ folio_get(folio);
+ pipe->head++;
+ page++;
+ spliced += part;
+ offset = 0;
+ }
+
+ return spliced;
+}
+
+/*
+ * Splice folios from the pagecache of a buffered (ie. non-O_DIRECT) file into
+ * a pipe.
+ */
+ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe,
+ size_t len, unsigned int flags)
+{
+ struct folio_batch fbatch;
+ struct kiocb iocb;
+ size_t total_spliced = 0, used, npages;
+ loff_t isize, end_offset;
+ bool writably_mapped;
+ int i, error = 0;
+
+ init_sync_kiocb(&iocb, in);
+ iocb.ki_pos = *ppos;
+
+ /* Work out how much data we can actually add into the pipe */
+ used = pipe_occupancy(pipe->head, pipe->tail);
+ npages = max_t(ssize_t, pipe->max_usage - used, 0);
+ len = min_t(size_t, len, npages * PAGE_SIZE);
+
+ folio_batch_init(&fbatch);
+
+ do {
+ cond_resched();
+
+ if (*ppos >= i_size_read(file_inode(in)))
+ break;
+
+ iocb.ki_pos = *ppos;
+ error = filemap_get_pages(&iocb, len, &fbatch, true);
+ if (error < 0)
+ break;
+
+ /*
+ * i_size must be checked after we know the pages are Uptodate.
+ *
+ * Checking i_size after the check allows us to calculate
+ * the correct value for "nr", which means the zero-filled
+ * part of the page is not copied back to userspace (unless
+ * another truncate extends the file - this is desired though).
+ */
+ isize = i_size_read(file_inode(in));
+ if (unlikely(*ppos >= isize))
+ break;
+ end_offset = min_t(loff_t, isize, *ppos + len);
+
+ /*
+ * Once we start copying data, we don't want to be touching any
+ * cachelines that might be contended:
+ */
+ writably_mapped = mapping_writably_mapped(in->f_mapping);
+
+ for (i = 0; i < folio_batch_count(&fbatch); i++) {
+ struct folio *folio = fbatch.folios[i];
+ size_t n;
+
+ if (folio_pos(folio) >= end_offset)
+ goto out;
+ folio_mark_accessed(folio);
+
+ /*
+ * If users can be writing to this folio using arbitrary
+ * virtual addresses, take care of potential aliasing
+ * before reading the folio on the kernel side.
+ */
+ if (writably_mapped)
+ flush_dcache_folio(folio);
+
+ n = splice_folio_into_pipe(pipe, folio, *ppos, len);
+ if (!n)
+ goto out;
+ len -= n;
+ total_spliced += n;
+ *ppos += n;
+ in->f_ra.prev_pos = *ppos;
+ if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+ goto out;
+ }
+
+ folio_batch_release(&fbatch);
+ } while (len);
+
+out:
+ folio_batch_release(&fbatch);
+ file_accessed(in);
+
+ return total_spliced ? total_spliced : error;
+}
+
static inline loff_t folio_seek_hole_data(struct xa_state *xas,
struct address_space *mapping, struct folio *folio,
loff_t start, loff_t end, bool seek_data)


2023-02-13 15:49:50

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] splice: Provide pipe_head_buf() helper

On 13.02.23 16:32, David Howells wrote:
> Provide a helper, pipe_head_buf(), to get the current head buffer from a
> pipe. Implement this as a wrapper around a more general function,
> pipe_buf(), that gets a specified buffer.
>
> Requested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> Reviewed-by: Christoph Hellwig <[email protected]>
> cc: Jens Axboe <[email protected]>
> cc: Al Viro <[email protected]>
> cc: John Hubbard <[email protected]>
> cc: David Hildenbrand <[email protected]>
> cc: Matthew Wilcox <[email protected]>
> cc: [email protected]
> cc: [email protected]
> cc: [email protected]
> ---

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb


2023-02-13 15:51:46

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] splice: Use init_sync_kiocb() in filemap_splice_read()

On 13.02.23 16:33, David Howells wrote:
> Use init_sync_kiocb() in filemap_splice_read() rather than open coding it.
>
> Requested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> cc: Christoph Hellwig <[email protected]>
> cc: Jens Axboe <[email protected]>
> cc: Al Viro <[email protected]>
> cc: John Hubbard <[email protected]>
> cc: David Hildenbrand <[email protected]>
> cc: Matthew Wilcox <[email protected]>
> cc: [email protected]
> cc: [email protected]
> cc: [email protected]
> ---

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb


2023-02-13 15:53:28

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] splice: Move filemap_read_splice() to mm/filemap.c

On 13.02.23 16:33, David Howells wrote:
> Move filemap_read_splice() to mm/filemap.c and make filemap_get_pages()
> static again.
>
> Requested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> Reviewed-by: Christoph Hellwig <[email protected]>
> cc: Jens Axboe <[email protected]>
> cc: Al Viro <[email protected]>
> cc: John Hubbard <[email protected]>
> cc: David Hildenbrand <[email protected]>
> cc: Matthew Wilcox <[email protected]>
> cc: [email protected]
> cc: [email protected]
> cc: [email protected]
> ---

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb


2023-02-13 17:05:04

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] iov_iter: Adjust styling/location of new splice functions

On 2/13/23 8:32 AM, David Howells wrote:
> Hi Jens, Al, Christoph,
>
> Here are patches to make some changes that Christoph requested[1] to the
> new generic file splice functions that I implemented[2]. Apart from one
> functional change, they just altering the styling and move one of the
> functions to a different file:
>
> (1) Rename the main functions:
>
> generic_file_buffered_splice_read() -> filemap_splice_read()
> generic_file_direct_splice_read() -> direct_splice_read()
>
> (2) Abstract out the calculation of the location of the head pipe buffer
> into a helper function in linux/pipe_fs_i.h.
>
> (3) Use init_sync_kiocb() in filemap_splice_read().
>
> This is where the functional change is. Some kiocb fields are then
> filled in where they were set to 0 before, including setting ki_flags
> from f_iocb_flags. I've filtered out IOCB_NOWAIT as the function is
> supposed to be synchronous.
>
> (4) Move filemap_splice_read() to mm/filemap.c. filemap_get_pages() can
> then be made static again.
>
> I've pushed the patches here also:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-extract-3
>
> I've also updated worked the changes into the commits on my iov-extract
> branch if that would be preferable, though that means Jens would need to
> update his for-6.3/iov-extract again.

Honestly, it's getting tight on timing at this point, and there's also
a crash report from today:

https://lore.kernel.org/linux-block/[email protected]/

I think we'd be better off folding in this series and then potentially
pushing this series to 6.4 rather than 6.3.

--
Jens Axboe