2023-07-24 06:35:05

by Nitesh Shetty

[permalink] [raw]
Subject: [PATCH] fs/read_write: Enable copy_file_range for block device.

From: Anuj Gupta <[email protected]>

Change generic_copy_file_checks to use ->f_mapping->host for both inode_in
and inode_out. Allow block device in generic_file_rw_checks.

Signed-off-by: Anuj Gupta <[email protected]>
Signed-off-by: Nitesh Shetty <[email protected]>
---
fs/read_write.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index b07de77ef126..eaeb481477f4 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1405,8 +1405,8 @@ static int generic_copy_file_checks(struct file *file_in, loff_t pos_in,
struct file *file_out, loff_t pos_out,
size_t *req_count, unsigned int flags)
{
- struct inode *inode_in = file_inode(file_in);
- struct inode *inode_out = file_inode(file_out);
+ struct inode *inode_in = file_in->f_mapping->host;
+ struct inode *inode_out = file_out->f_mapping->host;
uint64_t count = *req_count;
loff_t size_in;
int ret;
@@ -1708,7 +1708,9 @@ int generic_file_rw_checks(struct file *file_in, struct file *file_out)
/* Don't copy dirs, pipes, sockets... */
if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
return -EISDIR;
- if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
+ if (!S_ISREG(inode_in->i_mode) && !S_ISBLK(inode_in->i_mode))
+ return -EINVAL;
+ if ((inode_in->i_mode & S_IFMT) != (inode_out->i_mode & S_IFMT))
return -EINVAL;

if (!(file_in->f_mode & FMODE_READ) ||
--
2.25.1



2023-07-24 06:57:01

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] fs/read_write: Enable copy_file_range for block device.

On Mon, Jul 24, 2023 at 11:33:36AM +0530, Nitesh Shetty wrote:
> From: Anuj Gupta <[email protected]>
>
> Change generic_copy_file_checks to use ->f_mapping->host for both inode_in
> and inode_out. Allow block device in generic_file_rw_checks.

Why? copy_file_range() is for copying a range of a regular file to
another regular file - why do we want to support block devices for
somethign that is clearly intended for copying data files?

Also, the copy_file_range man page states:

ERRORS
.....
EINVAL Either fd_in or fd_out is not a regular file.
.....

If we are changing the behavioru of copy_file_range (why?), then man
page updates need to be done as well, documenting the change, which
kernel versions only support regular files, etc.

Cheers,

Dave.
--
Dave Chinner
[email protected]

2023-07-24 16:54:18

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] fs/read_write: Enable copy_file_range for block device.

> > Change generic_copy_file_checks to use ->f_mapping->host for both inode_in
> > and inode_out. Allow block device in generic_file_rw_checks.
>
> Why? copy_file_range() is for copying a range of a regular file to
> another regular file - why do we want to support block devices for
> somethign that is clearly intended for copying data files?

Nitesh has a series to add block layer copy offload, and uses that to
implement copy_file_range on block device nodes, which seems like a
sensible use case for copy_file_range on block device nodes, and that
series was hiding a change like this deep down in a "block" title
patch, so I asked for it to be split out. It still really should
be in that series, as there's very little point in changing this
check without an actual implementation making use of it.


2023-07-24 16:55:26

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] fs/read_write: Enable copy_file_range for block device.

> {
> - struct inode *inode_in = file_inode(file_in);
> - struct inode *inode_out = file_inode(file_out);
> + struct inode *inode_in = file_in->f_mapping->host;
> + struct inode *inode_out = file_out->f_mapping->host;

This doesn't directly have anything to do with block devices, as regular
files can also have a f_mapping that's different. None of the file
systems actually supporting copy offload right now do, but changing
the dereference here is a correctness thing totally independent of
block device support.

2023-07-24 23:02:45

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] fs/read_write: Enable copy_file_range for block device.

On Mon, Jul 24, 2023 at 06:38:38PM +0200, Christoph Hellwig wrote:
> > > Change generic_copy_file_checks to use ->f_mapping->host for both inode_in
> > > and inode_out. Allow block device in generic_file_rw_checks.
> >
> > Why? copy_file_range() is for copying a range of a regular file to
> > another regular file - why do we want to support block devices for
> > somethign that is clearly intended for copying data files?
>
> Nitesh has a series to add block layer copy offload,

Yes, I know.

> and uses that to
> implement copy_file_range on block device nodes,

Yes, I know.

> which seems like a
> sensible use case for copy_file_range on block device nodes,

Except for the fact it's documented and implemented as for copying
data ranges of *regular files*. Block devices are not regular
files...

There is nothing in this patchset that explains why this syscall
feature creep is desired, why it is the best solution, what benefits
it provides, how this new feature is discoverable, etc. It also does
not mention that user facing documentation needs to change, etc

> and that
> series was hiding a change like this deep down in a "block" title
> patch,

I know.

> so I asked for it to be split out. It still really should
> be in that series, as there's very little point in changing this
> check without an actual implementation making use of it.

And that's because it's the wrong way to be implementing block
device copy offloads.

That patchset originally added ioctls to issue block copy offloads
to block devices from userspace - that's the way block device
specific functionality is generally added and I have no problems
with that.

However, when I originally suggested that the generic
copy_file_range() fallback path that filesystems use (i.e.
generic_copy_file_range()) should try to use the block copy offload
path before falling back to using splice to copy the data through
host memory, things went off the rails.

That has turned into "copy_file_range() should support block devices
directly" and the ioctl interfaces were removed. The block copy
offload patchset still doesn't have a generic path for filesystems
to use this block copy offload. This is *not* what I originally
suggested, and provides none of the benefit to users that would come
from what I originally suggested. Block device copy offload is
largely useless to users if file data copies within a filesystem
don't make use of it - very few applications ever copy data directly
to/from block devices directly...

So from a system level point of view, expanding copy_file_range() to
do direct block device data copies doesn't make any sense. Expanding
the existing copy_file_range() generic fallback to attempt block
copy offload (i.e. hardware accel) makes much more sense, and will
make copy_file_range() much more useful to users on filesystem like
ext4 that don't have reflink support...

So, yeah, this patch, regardless of how it is presented, needs to a
whole lot more justification that "we want to do this" in the commit
message...

-Dave.
--
Dave Chinner
[email protected]