(Originally sent on 2023-10-16 as
<[email protected]>;
received no replies, resending unchanged per
Documentation/process/submitting-patches.rst#_resend_reminders).
Hi!
As it stands, splice(file -> pipe):
1. locks the pipe,
2. does a read from the file,
3. unlocks the pipe.
For reading from regular files and blockdevs this makes no difference.
But if the file is a tty or a socket, for example, this means that until
data appears, which it may never do, every process trying to read from
or open the pipe enters an uninterruptible sleep,
and will only exit it if the splicing process is killed.
This trivially denies service to:
* any hypothetical pipe-based log collexion system
* all nullmailer installations
* me, personally, when I'm pasting stuff into qemu -serial chardev:pipe
This follows:
1. https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
2. a security@ thread rooted in
<irrrblivicfc7o3lfq7yjm2lrxq35iyya4gyozlohw24gdzyg7@azmluufpdfvu>
3. https://nabijaczleweli.xyz/content/blogn_t/011-linux-splice-exclusion.html
Patches were posted and then discarded on principle or funxionality,
all in all terminating in Linus posting
> But it is possible that we need to just bite the bullet and say
> "copy_splice_read() needs to use a non-blocking kiocb for the IO".
This does that, effectively making splice(file -> pipe)
request (and require) O_NONBLOCK on reads from the file:
this doesn't affect splicing from regular files and blockdevs,
since they're always non-blocking
(and requesting the stronger "no kernel sleep" IOCB_NOWAIT is non-sensical),
but always returns -EINVAL for ttys.
Sockets behave as expected from O_NONBLOCK reads:
splice if there's data available else -EAGAIN.
This should all pretty much behave as-expected.
Mostly a re-based version of the summary diff from
<gnj4drf7llod4voaaasoh5jdlq545gduishrbc3ql3665pw7qy@ytd5ykxc4gsr>.
Bisexion yields commit 8924feff66f35fe22ce77aafe3f21eb8e5cff881
("splice: lift pipe_lock out of splice_to_pipe()") as first bad.
The patchset is made quite wide due to the many implementations
of the splice_read callback, and was based entirely on results from
$ git grep '\.splice_read.*=' | cut -d= -f2 |
tr -s ',;[:space:]' '\n' | sort -u
I'm assuming this is exhaustive, but it's 27 distinct implementations.
Of these, I've classified the following as trivial delegating wrappers:
nfs_file_splice_read               filemap_splice_read
afs_file_splice_read               filemap_splice_read
ceph_splice_read                   filemap_splice_read
ecryptfs_splice_read_update_atime  filemap_splice_read
ext4_file_splice_read              filemap_splice_read
f2fs_file_splice_read              filemap_splice_read
ntfs_file_splice_read              filemap_splice_read
ocfs2_file_splice_read             filemap_splice_read
orangefs_file_splice_read          filemap_splice_read
v9fs_file_splice_read              filemap_splice_read
xfs_file_splice_read               filemap_splice_read
zonefs_file_splice_read            filemap_splice_read
sock_splice_read                   copy_splice_read or a socket-specific one
coda_file_splice_read              vfs_splice_read
ovl_splice_read                    vfs_splice_read
filemap_splice_read() is used for regular files and blockdevs,
and thus needs no changes.
vfs_splice_read() delegates to copy_splice_read() or f_op->splice_read().
The rest are fixed, in patch order:
01. copy_splice_read() by simply doing the I/O with IOCB_NOWAIT;
diff from Linus:
https://lore.kernel.org/lkml/5osglsw36dla3mubtpsmdwdid4fsdacplyd6acx2igo4atogdg@yur3idyim3cc/t/#ee67de5a9ec18886c434113637d7eff6cd7acac4b
02. unix_stream_splice_read() by unconditionally passing MSG_DONTWAIT
03. fuse_dev_splice_read() by behaving as-if O_NONBLOCK
04. tracing_buffers_splice_read() by behaving as-if O_NONBLOCK
(this also removes the retry loop)
05. relay_file_splice_read() by behaving as-if SPLICE_F_NONBLOCK
(this just means EAGAINing unconditionally for an empty transfer)
06. smc_splice_read() by unconditionally passing MSG_DONTWAIT
07. kcm_splice_read() by unconditionally passing MSG_DONTWAIT
08. tls_sw_splice_read() by behaving as-if SPLICE_F_NONBLOCK
09. tcp_splice_read() by behaving as-if O_NONBLOCK
(this also removes the retry loop)
10. EINVALs on files that neither have FMODE_NOWAIT nor are S_ISREG
We don't want this to be just FMODE_NOWAIT since most regular files
don't have it set and that's not the right semantic anyway,
as noted at the top of this mail.
But this allows blockdevs "by accident", effectively,
since they have FMODE_NOWAIT (at least the ones I tried),
even though they're just like regular files:
handled by filemap_splice_read(),
thus not dispatched with IOCB_NOWAIT, since they're always non-blocking.
Should this be a check for FMODE_NOWAIT && (S_ISREG || S_ISBLK)?
Should it remain FMODE_NOWAIT && S_ISREG?
Is there an even better way of spelling this?
In net/kcm, this also fixes kcm_splice_read() passing SPLICE_F_*-style
flags to skb_recv_datagram(), which takes MSG_*-style flags.
I don't think they did anything anyway? But.
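To illustrate why the two flag namespaces can't be mixed: as far as the
Linux userspace headers go, SPLICE_F_NONBLOCK (0x02) and MSG_DONTWAIT
(0x40) are different bits, so a SPLICE_F_* mask has to be translated
explicitly before it is handed to anything expecting MSG_* flags.
A minimal sketch; splice_flags_to_msg_flags() is a hypothetical helper,
and what the stray bits actually did inside skb_recv_datagram()
is not established here:

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* SPLICE_F_* */
#include <sys/socket.h> /* MSG_* */

/* Hypothetical helper: map the splice() flags that have a recvmsg()
 * equivalent onto MSG_* flags; every other SPLICE_F_* bit is dropped
 * instead of being passed through as an unrelated MSG_* bit. */
int splice_flags_to_msg_flags(unsigned int splice_flags)
{
	int msg_flags = 0;

	if (splice_flags & SPLICE_F_NONBLOCK)
		msg_flags |= MSG_DONTWAIT;
	return msg_flags;
}
```

With this series the translation becomes moot for kcm_splice_read(),
since MSG_DONTWAIT is passed unconditionally.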
I would of course be remiss to not analyse splice(pipe -> file) as well:
gfs2_file_splice_write  iter_file_splice_write
ovl_splice_write        iter_file_splice_write
splice_write_null       splice_from_pipe(pipe_to_null), does nothing
fuse_dev_splice_write() locks, copies the iovs, unlocks, does I/O,
locks, frees the pipe's iovs, unlocks
port_fops_splice_write() locks, steals or copies pages, unlocks, does I/O
11. splice_to_socket():
has sock_sendmsg() inside the pipe lock;
filling the socket buffer sleeps in splice with the pipe locked,
and this is trivial to trigger with
./af_unix_ptf ./splicing-cat < fifo &
cat > fifo &
cp 64k fifo a couple times
patch does unconditional MSG_DONTWAIT, tests sensibly
iter_file_splice_write():
has vfs_iter_write() inside the pipe lock,
but appears to be attached to regular files and blockdevs,
but also random_fops & urandom_fops (looks like not an issue)
and tty_fops & console_fops
(this only means non-pty ttys so no issue with a full buffer?
idk if there's a situation where a tty or the discipline can block forever
or if it's guaranteed forward progress, however slow?
still kinda ass to have the pipe lock hard-held for, say,
(64*1024)/(300/8)s=30min if the pipe has 64k in the buffer?
this predixion aligns precisely with what I measured:
1# stty 300 < /dev/ttyS0
1# ./splicing-cat < fifo > /dev/ttyS0
2$ cat > fifo # and typing works
3$ cp 64k fifo # uninterruptibly sleeps in write(4, "SzmprOmdIIkciMwbpxhsEyFVORaPGbRQ"..., 66560
1: now sleeping in splice
2: typing more into the cat uninterruptibly sleeps in write
4$ : > /tmp/fifo # uninterruptibly hangs in open
similarly, "cp 10k fifo" uninterruptibly sleeps in close,
with the same effects on other (potential) writers,
and woke up after around five minutes, which matches my maths
so presumably something should be done about this as well?
just idk what)
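The back-of-the-envelope numbers above can be reproduced with a few
lines of C. This is only a sketch of the same arithmetic: it assumes,
like the estimate in the text, that the line moves baud/8 bytes per
second (no start/stop-bit overhead), and drain_seconds() is a
hypothetical name:

```c
/* Seconds needed to push `bytes` through a serial line running at
 * `baud` bits per second, assuming 8 bits on the wire per byte. */
double drain_seconds(double bytes, double baud)
{
	return bytes / (baud / 8.0);
}
```

drain_seconds(64 * 1024, 300) comes to about 1748 s (roughly 29 minutes),
and, taking 10k to mean 10240 bytes, drain_seconds(10 * 1024, 300) comes
to about 273 s (roughly four and a half minutes).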
So. AFAIK, just iter_file_splice_write() on ttys remains.
This needs a man-pages patch as well,
but I'd go rabid if I were to write it rn.
For the samples above, af_unix_ptf.c:
-- >8 --
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>
int main(int argc, char ** argv) {
	int fds[2];
	if(socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, fds))
		abort();
	/* child: run argv[1..] with its stdout on one end of the socketpair */
	if(!vfork()) {
		dup2(fds[1], 1);
		_exit(execvp(argv[1], argv + 1));
	}
	/* parent: drain the other end very slowly, so the socket buffer fills */
	dup2(fds[0], 0);
	for(;;) {
		char buf[16];
		int r = read(0, buf, 16);
		fprintf(stderr, "read %d\n", r);
		sleep(10);
	}
}
-- >8 --
splicing-cat.c:
-- >8 --
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <errno.h>
int main() {
	int lasterr = -1;
	unsigned ctr = 0;
	for(;;) {
		errno = 0;
		ssize_t ret = splice(0, 0, 1, 0, 128 * 1024 * 1024, 0);
		/* new status line on success or on a change of error;
		 * the leading newline is skipped the first time round */
		if(ret >= 0 || errno != lasterr) {
			fprintf(stderr, "\n\t%m" + (lasterr == -1));
			lasterr = errno;
			ctr = 0;
		}
		/* count repeated failures, or show the number of bytes spliced */
		if(ret == -1) {
			++ctr;
			fprintf(stderr, "\r%u", ctr);
		} else
			fprintf(stderr, "\r%zu", ret);
		/* 0 means EOF */
		if(!ret)
			break;
	}
	fprintf(stderr, "\n");
}
-- >8 --
Ahelenia Ziemiańska (11):
splice: copy_splice_read: do the I/O with IOCB_NOWAIT
af_unix: unix_stream_splice_read: always request MSG_DONTWAIT
fuse: fuse_dev_splice_read: use nonblocking I/O
tracing: tracing_buffers_splice_read: behave as-if non-blocking I/O
relayfs: relay_file_splice_read: always return -EAGAIN for no data
net/smc: smc_splice_read: always request MSG_DONTWAIT
kcm: kcm_splice_read: always request MSG_DONTWAIT
tls/sw: tls_sw_splice_read: always request non-blocking I/O
net/tcp: tcp_splice_read: always do non-blocking reads
splice: file->pipe: -EINVAL for non-regular files w/o FMODE_NOWAIT
splice: splice_to_socket: always request MSG_DONTWAIT
fs/fuse/dev.c | 10 ++++++----
fs/splice.c | 7 ++++---
kernel/relay.c | 3 +--
kernel/trace/trace.c | 32 ++++----------------------------
net/ipv4/tcp.c | 30 +++---------------------------
net/kcm/kcmsock.c | 2 +-
net/smc/af_smc.c | 6 +-----
net/tls/tls_sw.c | 5 ++---
net/unix/af_unix.c | 5 +----
9 files changed, 23 insertions(+), 77 deletions(-)
base-commit: 58720809f52779dc0f08e53e54b014209d13eebb
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
kernel/trace/trace.c | 32 ++++----------------------------
1 file changed, 4 insertions(+), 28 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index abaaf516fcae..9be7a4c0a3ca 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -8477,7 +8477,6 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
if (splice_grow_spd(pipe, &spd))
return -ENOMEM;
- again:
trace_access_lock(iter->cpu_file);
entries = ring_buffer_entries_cpu(iter->array_buffer->buffer, iter->cpu_file);
@@ -8528,35 +8527,12 @@ tracing_buffers_splice_read(struct file *file, loff_t *ppos,
/* did we read anything? */
if (!spd.nr_pages) {
- long wait_index;
-
- if (ret)
- goto out;
-
- ret = -EAGAIN;
- if ((file->f_flags & O_NONBLOCK) || (flags & SPLICE_F_NONBLOCK))
- goto out;
-
- wait_index = READ_ONCE(iter->wait_index);
-
- ret = wait_on_pipe(iter, iter->tr->buffer_percent);
- if (ret)
- goto out;
-
- /* No need to wait after waking up when tracing is off */
- if (!tracer_tracing_is_on(iter->tr))
- goto out;
-
- /* Make sure we see the new wait_index */
- smp_rmb();
- if (wait_index != iter->wait_index)
- goto out;
-
- goto again;
+ if (!ret)
+ ret = -EAGAIN;
+ } else {
+ ret = splice_to_pipe(pipe, &spd);
}
- ret = splice_to_pipe(pipe, &spd);
-out:
splice_shrink_spd(&spd);
return ret;
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
fs/fuse/dev.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 1a8f82f478cb..4e8caf66c01e 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1202,7 +1202,8 @@ __releases(fiq->lock)
* the 'sent' flag.
*/
static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
- struct fuse_copy_state *cs, size_t nbytes)
+ struct fuse_copy_state *cs, size_t nbytes,
+ bool nonblock)
{
ssize_t err;
struct fuse_conn *fc = fud->fc;
@@ -1238,7 +1239,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
break;
spin_unlock(&fiq->lock);
- if (file->f_flags & O_NONBLOCK)
+ if (nonblock)
return -EAGAIN;
err = wait_event_interruptible_exclusive(fiq->waitq,
!fiq->connected || request_pending(fiq));
@@ -1364,7 +1365,8 @@ static ssize_t fuse_dev_read(struct kiocb *iocb, struct iov_iter *to)
fuse_copy_init(&cs, 1, to);
- return fuse_dev_do_read(fud, file, &cs, iov_iter_count(to));
+ return fuse_dev_do_read(fud, file, &cs, iov_iter_count(to),
+ file->f_flags & O_NONBLOCK);
}
static ssize_t fuse_dev_splice_read(struct file *in, loff_t *ppos,
@@ -1388,7 +1390,7 @@ static ssize_t fuse_dev_splice_read(struct file *in, loff_t *ppos,
fuse_copy_init(&cs, 1, NULL);
cs.pipebufs = bufs;
cs.pipe = pipe;
- ret = fuse_dev_do_read(fud, in, &cs, len);
+ ret = fuse_dev_do_read(fud, in, &cs, len, true);
if (ret < 0)
goto out;
--
2.39.2
For consistency with the new "file->pipe reads non-blockingly" semantic.
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
kernel/relay.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/relay.c b/kernel/relay.c
index 83fe0325cde1..3d381e94a204 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -1215,8 +1215,7 @@ static ssize_t relay_file_splice_read(struct file *in,
if (ret < 0)
break;
else if (!ret) {
- if (flags & SPLICE_F_NONBLOCK)
- ret = -EAGAIN;
+ ret = -EAGAIN;
break;
}
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
fs/splice.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/splice.c b/fs/splice.c
index d983d375ff11..9d29664f23ee 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -361,6 +361,7 @@ ssize_t copy_splice_read(struct file *in, loff_t *ppos,
iov_iter_bvec(&to, ITER_DEST, bv, npages, len);
init_sync_kiocb(&kiocb, in);
kiocb.ki_pos = *ppos;
+ kiocb.ki_flags |= IOCB_NOWAIT;
ret = call_read_iter(in, &kiocb, &to);
if (ret > 0) {
--
2.39.2
We request non-blocking I/O in the generic implementation, but some
files ‒ ttys ‒ only check O_NONBLOCK. Refuse them here, lest we
risk sleeping with the pipe locked for indeterminate lengths of
time.
This also masks inconsistent wake-ups (usually every second line)
when splicing from ttys in icanon mode.
Regular files don't /have/ a distinct O_NONBLOCK mode,
because they always behave non-blockingly, and for them FMODE_NOWAIT is
used in the purest sense of
/* File is capable of returning -EAGAIN if I/O will block */
which is not set by the vast majority of filesystems,
and it's not the semantic we want here.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
fs/splice.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/splice.c b/fs/splice.c
index 9d29664f23ee..81788bf7daa1 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1300,6 +1300,8 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out,
} else if (opipe) {
if (off_out)
return -ESPIPE;
+ if (!((in->f_mode & FMODE_NOWAIT) || S_ISREG(in->f_inode->i_mode)))
+ return -EINVAL;
if (off_in) {
if (!(in->f_mode & FMODE_PREAD))
return -EINVAL;
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
fs/splice.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/fs/splice.c b/fs/splice.c
index 81788bf7daa1..d5885032f9a8 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -869,13 +869,11 @@ ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out,
if (!bc)
break;
- msg.msg_flags = MSG_SPLICE_PAGES;
+ msg.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT;
if (flags & SPLICE_F_MORE)
msg.msg_flags |= MSG_MORE;
if (remain && pipe_occupancy(pipe->head, tail) > 0)
msg.msg_flags |= MSG_MORE;
- if (out->f_flags & O_NONBLOCK)
- msg.msg_flags |= MSG_DONTWAIT;
iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, bc,
len - remain);
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
net/smc/af_smc.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index bacdd971615e..89473305f629 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -3243,12 +3243,8 @@ static ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
rc = -ESPIPE;
goto out;
}
- if (flags & SPLICE_F_NONBLOCK)
- flags = MSG_DONTWAIT;
- else
- flags = 0;
SMC_STAT_INC(smc, splice_cnt);
- rc = smc_rx_recvmsg(smc, NULL, pipe, len, flags);
+ rc = smc_rx_recvmsg(smc, NULL, pipe, len, MSG_DONTWAIT);
}
out:
release_sock(sk);
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
net/unix/af_unix.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3e8a04a13668..9489b9bda753 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2889,15 +2889,12 @@ static ssize_t unix_stream_splice_read(struct socket *sock, loff_t *ppos,
.pipe = pipe,
.size = size,
.splice_flags = flags,
+ .flags = MSG_DONTWAIT,
};
if (unlikely(*ppos))
return -ESPIPE;
- if (sock->file->f_flags & O_NONBLOCK ||
- flags & SPLICE_F_NONBLOCK)
- state.flags = MSG_DONTWAIT;
-
return unix_stream_read_generic(&state, false);
}
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Also: don't pass the SPLICE_F_*-style flags argument to
skb_recv_datagram(), which expects MSG_*-style flags.
This fixes SPLICE_F_NONBLOCK not having worked.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
net/kcm/kcmsock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index dd1d8ffd5f59..de70156869e6 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1028,7 +1028,7 @@ static ssize_t kcm_splice_read(struct socket *sock, loff_t *ppos,
/* Only support splice for SOCKSEQPACKET */
- skb = skb_recv_datagram(sk, flags, &err);
+ skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err);
if (!skb)
goto err_out;
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
net/tls/tls_sw.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index d1fc295b83b5..73d88c6739e8 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2145,7 +2145,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
int chunk;
int err;
- err = tls_rx_reader_lock(sk, ctx, flags & SPLICE_F_NONBLOCK);
+ err = tls_rx_reader_lock(sk, ctx, true);
if (err < 0)
return err;
@@ -2154,8 +2154,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
} else {
struct tls_decrypt_arg darg;
- err = tls_rx_rec_wait(sk, NULL, flags & SPLICE_F_NONBLOCK,
- true);
+ err = tls_rx_rec_wait(sk, NULL, true, true);
if (err <= 0)
goto splice_read_end;
--
2.39.2
Otherwise we risk sleeping with the pipe locked for indeterminate
lengths of time.
sock_rcvtimeo() returns 0 if the second argument is true, so the
explicit re-try loop for empty read conditions can be removed
entirely.
Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
---
net/ipv4/tcp.c | 30 +++---------------------------
1 file changed, 3 insertions(+), 27 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3f66cdeef7de..09b562e2c1bf 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -782,7 +782,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
.len = len,
.flags = flags,
};
- long timeo;
ssize_t spliced;
int ret;
@@ -797,7 +796,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
lock_sock(sk);
- timeo = sock_rcvtimeo(sk, sock->file->f_flags & O_NONBLOCK);
while (tss.len) {
ret = __tcp_splice_read(sk, &tss);
if (ret < 0)
@@ -821,35 +819,13 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
ret = -ENOTCONN;
break;
}
- if (!timeo) {
- ret = -EAGAIN;
- break;
- }
- /* if __tcp_splice_read() got nothing while we have
- * an skb in receive queue, we do not want to loop.
- * This might happen with URG data.
- */
- if (!skb_queue_empty(&sk->sk_receive_queue))
- break;
- sk_wait_data(sk, &timeo, NULL);
- if (signal_pending(current)) {
- ret = sock_intr_errno(timeo);
- break;
- }
- continue;
+ ret = -EAGAIN;
+ break;
}
tss.len -= ret;
spliced += ret;
- if (!tss.len || !timeo)
- break;
- release_sock(sk);
- lock_sock(sk);
-
- if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
- (sk->sk_shutdown & RCV_SHUTDOWN) ||
- signal_pending(current))
- break;
+ break;
}
release_sock(sk);
--
2.39.2
Please add correct tag, for this patch, IIUC, it should be a fix, and
you need add [PATCH net].
On Tue, Dec 12, 2023 at 11:12:47AM +0100, Ahelenia Ziemiańska wrote:
> Otherwise we risk sleeping with the pipe locked for indeterminate
> lengths of time.
>
> Link: https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
Fixes line is needed.
> Signed-off-by: Ahelenia Ziemiańska <[email protected]>
> ---
> net/smc/af_smc.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index bacdd971615e..89473305f629 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -3243,12 +3243,8 @@ static ssize_t smc_splice_read(struct socket *sock, loff_t *ppos,
> rc = -ESPIPE;
> goto out;
> }
> - if (flags & SPLICE_F_NONBLOCK)
> - flags = MSG_DONTWAIT;
> - else
> - flags = 0;
> SMC_STAT_INC(smc, splice_cnt);
> - rc = smc_rx_recvmsg(smc, NULL, pipe, len, flags);
> + rc = smc_rx_recvmsg(smc, NULL, pipe, len, MSG_DONTWAIT);
> }
> out:
> release_sock(sk);
> --
> 2.39.2
On Wed, 13 Dec 2023 09:48:47 +0800 Tony Lu wrote:
> Please add correct tag, for this patch, IIUC, it should be a fix, and
> you need add [PATCH net].
I was wondering who's expected to take this. We (netdev/net maintainers)
didn't even get CCed on all the patches in the series.
My sense is that this is more of a VFS change, so Al / Christian may be
better suited to take this?
Let's figure that out before we get another repost.
> Let's figure that out before we get another repost.
I'm just waiting for Jens to review it as he had comments on this
before.
On 12/14/23 3:50 AM, Christian Brauner wrote:
>> Let's figure that out before we get another repost.
>
> I'm just waiting for Jens to review it as he had comments on this
> before.
Well, I do wish the CC list had been setup a bit more deliberately.
Especially as this is a resend, and I didn't even know about any of this
before Christian pointed me this way the other day.
Checking lore, I can't even see all the patches. So while it may be
annoying, I do think it may be a good idea to resend the series so I can
take a closer look as well. I do think it's interesting and I'd love to
have it work in a non-blocking fashion, both solving the issue of splice
holding the pipe lock while doing IO, and also then being able to
eliminate the pipe_clear_nowait() hack hopefully.
--
Jens Axboe
On Thu, 14 Dec 2023 09:57:32 -0700 Jens Axboe wrote:
> On 12/14/23 3:50 AM, Christian Brauner wrote:
> >> Let's figure that out before we get another repost.
> >
> > I'm just waiting for Jens to review it as he had comments on this
> > before.
>
> Well, I do wish the CC list had been setup a bit more deliberately.
> Especially as this is a resend, and I didn't even know about any of this
> before Christian pointed me this way the other day.
>
> Checking lore, I can't even see all the patches. So while it may be
> annoying, I do think it may be a good idea to resend the series so I can
> take a closer look as well.
So to summarize - for the repost please make sure to CC Jens,
Christian, Al, [email protected] on *all* patches.
No need to add "net" to subject prefix, or CC net on all.
> I do think it's interesting and I'd love to
> have it work in a non-blocking fashion, both solving the issue of splice
> holding the pipe lock while doing IO, and also then being able to
> eliminate the pipe_clear_nowait() hack hopefully.