REPO: https://github.com/smipi1/linux-tinification.git
BRANCH: tiny/config-syscall-splice
BACKGROUND: This patch-set forms part of the Linux Kernel Tinification effort (
https://tiny.wiki.kernel.org/).
GOAL: Support compiling out the splice family of syscalls (splice, vmsplice,
tee and sendfile) along with all supporting infrastructure if not needed.
Many embedded systems will not need the splice-family syscalls. Omitting them
saves space.
HISTORY:
PATCH v6:
- Removed unnecessary addition of __maybe_unused in fs/fuse
PATCH v5:
- Fix up commit log still referring to dropped __splice_p()
PATCH v4:
- Drops __splice_p()
- Let nfsd fall back to non-splice support when splice is compiled out
- Style fixes
PATCH v3:
- Fixup commit logs so that they are consistent with patch strategy
- Style fixes
PATCH v2:
- Avoid the ifdef mess introduced in PATCH v1 by mocking out exported splice
functions.
STRATEGY:
a. With the goal of eventually compiling out fs/splice.c, several functions
that are only used in support of the splice family of syscalls are moved
into fs/splice.c from fs/read_write.c. The kernel_write function, which is not
used to support the splice syscalls, is moved to fs/read_write.c.
b. Introduce an EXPERT kernel configuration option, CONFIG_SYSCALL_SPLICE, to
compile out the splice family of syscalls. This removes all userspace uses
of the splice infrastructure.
c. Splice exports an operations struct, nosteal_pipe_buf_ops. Eliminate the
uses of this struct when CONFIG_SYSCALL_SPLICE is undefined, so that splice
can later be compiled out.
d. Let nfsd fall back to non-splice support when splice is compiled out.
e. Compile out fs/splice.c. Functions exported by fs/splice are mocked out with
failing static inlines. This is done so as to all but eliminate the
maintenance burden on file-system drivers.
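Step (e) can be illustrated with a small userspace sketch. It is not kernel
code: DEMO_SYSCALL_SPLICE, struct demo_pipe, demo_splice_to_pipe() and
demo_driver_splice_read() are hypothetical stand-ins chosen for this example.
It only shows the header pattern, assuming the stubs return -EPERM as in this
patch-set: the caller needs no #ifdef of its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the Kconfig symbol; left undefined here. */
/* #define DEMO_SYSCALL_SPLICE 1 */

struct demo_pipe;	/* opaque; stands in for struct pipe_inode_info */

#ifdef DEMO_SYSCALL_SPLICE
/* The real implementation would live in its own translation unit. */
ssize_t demo_splice_to_pipe(struct demo_pipe *pipe, size_t len);
#else
/* Failing stub: same signature, no code from the compiled-out unit. */
static inline ssize_t demo_splice_to_pipe(struct demo_pipe *pipe, size_t len)
{
	(void)pipe;
	(void)len;
	return -EPERM;
}
#endif

/* Hypothetical driver-side caller: identical source in both configurations. */
static ssize_t demo_driver_splice_read(struct demo_pipe *pipe, size_t len)
{
	return demo_splice_to_pipe(pipe, len);
}

/* Usage check: with the symbol undefined, the call fails cleanly. */
static void demo_check(void)
{
	assert(demo_driver_splice_read(NULL, 16) == -EPERM);
}
```

Because the stub is a static inline, the compiler can fold the constant error
return into the caller, so the disabled configuration carries close to no
code for the call.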
RESULTS: A tinyconfig bloat-o-meter score for the entire patch-set:
add/remove: 0/41 grow/shrink: 5/7 up/down: 23/-8422 (-8399)
function old new delta
sys_pwritev 115 122 +7
sys_preadv 115 122 +7
fdput_pos 29 36 +7
sys_pwrite64 115 116 +1
sys_pread64 115 116 +1
pipe_to_null 4 - -4
generic_pipe_buf_nosteal 6 - -6
spd_release_page 10 - -10
fdput 11 - -11
PageUptodate 22 11 -11
lock_page 36 24 -12
signal_pending 39 26 -13
fdget 56 42 -14
page_cache_pipe_buf_release 16 - -16
user_page_pipe_buf_ops 20 - -20
splice_write_null 24 4 -20
page_cache_pipe_buf_ops 20 - -20
nosteal_pipe_buf_ops 20 - -20
default_pipe_buf_ops 20 - -20
generic_splice_sendpage 24 - -24
user_page_pipe_buf_steal 25 - -25
splice_shrink_spd 27 - -27
pipe_to_user 43 - -43
direct_splice_actor 47 - -47
default_file_splice_write 49 - -49
wakeup_pipe_writers 54 - -54
wakeup_pipe_readers 54 - -54
write_pipe_buf 71 - -71
page_cache_pipe_buf_confirm 80 - -80
splice_grow_spd 87 - -87
do_splice_to 87 - -87
ipipe_prep.part 92 - -92
splice_from_pipe 93 - -93
splice_from_pipe_next 107 - -107
pipe_to_sendpage 109 - -109
page_cache_pipe_buf_steal 114 - -114
opipe_prep.part 119 - -119
sys_sendfile 122 - -122
generic_file_splice_read 131 8 -123
sys_sendfile64 126 - -126
sys_vmsplice 137 - -137
do_splice_direct 148 - -148
vmsplice_to_user 205 - -205
__splice_from_pipe 246 - -246
splice_direct_to_actor 348 - -348
splice_to_pipe 371 - -371
do_sendfile 492 - -492
sys_tee 497 - -497
vmsplice_to_pipe 558 - -558
default_file_splice_read 688 - -688
iter_file_splice_write 702 4 -698
sys_splice 1075 - -1075
__generic_file_splice_read 1109 - -1109
Pieter Smith (7):
fs: move sendfile syscall into fs/splice
fs: moved kernel_write to fs/read_write
fs/splice: support compiling out splice-family syscalls
fs/fuse: support compiling out splice
net/core: support compiling out splice
fs/nfsd: support compiling out splice
fs/splice: full support for compiling out splice
fs/Makefile | 3 +-
fs/fuse/dev.c | 4 +
fs/read_write.c | 181 +++------------------------------------------
fs/splice.c | 194 +++++++++++++++++++++++++++++++++++++++++++++----
include/linux/fs.h | 26 +++++++
include/linux/skbuff.h | 10 +++
include/linux/splice.h | 42 +++++++++++
init/Kconfig | 10 +++
kernel/sys_ni.c | 8 ++
net/core/skbuff.c | 11 ++-
net/sunrpc/svc.c | 2 +-
11 files changed, 299 insertions(+), 192 deletions(-)
--
2.1.0
sendfile functionally forms part of the splice group of syscalls (splice,
vmsplice and tee). Grouping sendfile with splice paves the way to compiling out
the splice group of syscalls for embedded systems that do not need these.
add/remove: 0/0 grow/shrink: 7/2 up/down: 86/-61 (25)
function old new delta
file_start_write 34 68 +34
file_end_write 29 58 +29
sys_pwritev 115 122 +7
sys_preadv 115 122 +7
fdput_pos 29 36 +7
sys_pwrite64 115 116 +1
sys_pread64 115 116 +1
sys_tee 497 491 -6
sys_splice 1075 1020 -55
Signed-off-by: Pieter Smith <[email protected]>
---
fs/read_write.c | 175 -------------------------------------------------------
fs/splice.c | 178 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 178 insertions(+), 175 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 7d9318c..d9451ba 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1191,178 +1191,3 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
}
#endif
-static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
- size_t count, loff_t max)
-{
- struct fd in, out;
- struct inode *in_inode, *out_inode;
- loff_t pos;
- loff_t out_pos;
- ssize_t retval;
- int fl;
-
- /*
- * Get input file, and verify that it is ok..
- */
- retval = -EBADF;
- in = fdget(in_fd);
- if (!in.file)
- goto out;
- if (!(in.file->f_mode & FMODE_READ))
- goto fput_in;
- retval = -ESPIPE;
- if (!ppos) {
- pos = in.file->f_pos;
- } else {
- pos = *ppos;
- if (!(in.file->f_mode & FMODE_PREAD))
- goto fput_in;
- }
- retval = rw_verify_area(READ, in.file, &pos, count);
- if (retval < 0)
- goto fput_in;
- count = retval;
-
- /*
- * Get output file, and verify that it is ok..
- */
- retval = -EBADF;
- out = fdget(out_fd);
- if (!out.file)
- goto fput_in;
- if (!(out.file->f_mode & FMODE_WRITE))
- goto fput_out;
- retval = -EINVAL;
- in_inode = file_inode(in.file);
- out_inode = file_inode(out.file);
- out_pos = out.file->f_pos;
- retval = rw_verify_area(WRITE, out.file, &out_pos, count);
- if (retval < 0)
- goto fput_out;
- count = retval;
-
- if (!max)
- max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes);
-
- if (unlikely(pos + count > max)) {
- retval = -EOVERFLOW;
- if (pos >= max)
- goto fput_out;
- count = max - pos;
- }
-
- fl = 0;
-#if 0
- /*
- * We need to debate whether we can enable this or not. The
- * man page documents EAGAIN return for the output at least,
- * and the application is arguably buggy if it doesn't expect
- * EAGAIN on a non-blocking file descriptor.
- */
- if (in.file->f_flags & O_NONBLOCK)
- fl = SPLICE_F_NONBLOCK;
-#endif
- file_start_write(out.file);
- retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl);
- file_end_write(out.file);
-
- if (retval > 0) {
- add_rchar(current, retval);
- add_wchar(current, retval);
- fsnotify_access(in.file);
- fsnotify_modify(out.file);
- out.file->f_pos = out_pos;
- if (ppos)
- *ppos = pos;
- else
- in.file->f_pos = pos;
- }
-
- inc_syscr(current);
- inc_syscw(current);
- if (pos > max)
- retval = -EOVERFLOW;
-
-fput_out:
- fdput(out);
-fput_in:
- fdput(in);
-out:
- return retval;
-}
-
-SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd, off_t __user *, offset, size_t, count)
-{
- loff_t pos;
- off_t off;
- ssize_t ret;
-
- if (offset) {
- if (unlikely(get_user(off, offset)))
- return -EFAULT;
- pos = off;
- ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
- if (unlikely(put_user(pos, offset)))
- return -EFAULT;
- return ret;
- }
-
- return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-
-SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd, loff_t __user *, offset, size_t, count)
-{
- loff_t pos;
- ssize_t ret;
-
- if (offset) {
- if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
- return -EFAULT;
- ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
- if (unlikely(put_user(pos, offset)))
- return -EFAULT;
- return ret;
- }
-
- return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-
-#ifdef CONFIG_COMPAT
-COMPAT_SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd,
- compat_off_t __user *, offset, compat_size_t, count)
-{
- loff_t pos;
- off_t off;
- ssize_t ret;
-
- if (offset) {
- if (unlikely(get_user(off, offset)))
- return -EFAULT;
- pos = off;
- ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
- if (unlikely(put_user(pos, offset)))
- return -EFAULT;
- return ret;
- }
-
- return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-
-COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
- compat_loff_t __user *, offset, compat_size_t, count)
-{
- loff_t pos;
- ssize_t ret;
-
- if (offset) {
- if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
- return -EFAULT;
- ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
- if (unlikely(put_user(pos, offset)))
- return -EFAULT;
- return ret;
- }
-
- return do_sendfile(out_fd, in_fd, NULL, count, 0);
-}
-#endif
diff --git a/fs/splice.c b/fs/splice.c
index f5cb9ba..c1a2861 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -28,6 +28,7 @@
#include <linux/export.h>
#include <linux/syscalls.h>
#include <linux/uio.h>
+#include <linux/fsnotify.h>
#include <linux/security.h>
#include <linux/gfp.h>
#include <linux/socket.h>
@@ -2039,3 +2040,180 @@ SYSCALL_DEFINE4(tee, int, fdin, int, fdout, size_t, len, unsigned int, flags)
return error;
}
+
+static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
+ size_t count, loff_t max)
+{
+ struct fd in, out;
+ struct inode *in_inode, *out_inode;
+ loff_t pos;
+ loff_t out_pos;
+ ssize_t retval;
+ int fl;
+
+ /*
+ * Get input file, and verify that it is ok..
+ */
+ retval = -EBADF;
+ in = fdget(in_fd);
+ if (!in.file)
+ goto out;
+ if (!(in.file->f_mode & FMODE_READ))
+ goto fput_in;
+ retval = -ESPIPE;
+ if (!ppos) {
+ pos = in.file->f_pos;
+ } else {
+ pos = *ppos;
+ if (!(in.file->f_mode & FMODE_PREAD))
+ goto fput_in;
+ }
+ retval = rw_verify_area(READ, in.file, &pos, count);
+ if (retval < 0)
+ goto fput_in;
+ count = retval;
+
+ /*
+ * Get output file, and verify that it is ok..
+ */
+ retval = -EBADF;
+ out = fdget(out_fd);
+ if (!out.file)
+ goto fput_in;
+ if (!(out.file->f_mode & FMODE_WRITE))
+ goto fput_out;
+ retval = -EINVAL;
+ in_inode = file_inode(in.file);
+ out_inode = file_inode(out.file);
+ out_pos = out.file->f_pos;
+ retval = rw_verify_area(WRITE, out.file, &out_pos, count);
+ if (retval < 0)
+ goto fput_out;
+ count = retval;
+
+ if (!max)
+ max = min(in_inode->i_sb->s_maxbytes, out_inode->i_sb->s_maxbytes);
+
+ if (unlikely(pos + count > max)) {
+ retval = -EOVERFLOW;
+ if (pos >= max)
+ goto fput_out;
+ count = max - pos;
+ }
+
+ fl = 0;
+#if 0
+ /*
+ * We need to debate whether we can enable this or not. The
+ * man page documents EAGAIN return for the output at least,
+ * and the application is arguably buggy if it doesn't expect
+ * EAGAIN on a non-blocking file descriptor.
+ */
+ if (in.file->f_flags & O_NONBLOCK)
+ fl = SPLICE_F_NONBLOCK;
+#endif
+ file_start_write(out.file);
+ retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl);
+ file_end_write(out.file);
+
+ if (retval > 0) {
+ add_rchar(current, retval);
+ add_wchar(current, retval);
+ fsnotify_access(in.file);
+ fsnotify_modify(out.file);
+ out.file->f_pos = out_pos;
+ if (ppos)
+ *ppos = pos;
+ else
+ in.file->f_pos = pos;
+ }
+
+ inc_syscr(current);
+ inc_syscw(current);
+ if (pos > max)
+ retval = -EOVERFLOW;
+
+fput_out:
+ fdput(out);
+fput_in:
+ fdput(in);
+out:
+ return retval;
+}
+
+SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd, off_t __user *, offset, size_t, count)
+{
+ loff_t pos;
+ off_t off;
+ ssize_t ret;
+
+ if (offset) {
+ if (unlikely(get_user(off, offset)))
+ return -EFAULT;
+ pos = off;
+ ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
+ if (unlikely(put_user(pos, offset)))
+ return -EFAULT;
+ return ret;
+ }
+
+ return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+
+SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd, loff_t __user *, offset, size_t, count)
+{
+ loff_t pos;
+ ssize_t ret;
+
+ if (offset) {
+ if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
+ return -EFAULT;
+ ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
+ if (unlikely(put_user(pos, offset)))
+ return -EFAULT;
+ return ret;
+ }
+
+ return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+
+#ifdef CONFIG_COMPAT
+COMPAT_SYSCALL_DEFINE4(sendfile, int, out_fd, int, in_fd,
+ compat_off_t __user *, offset, compat_size_t, count)
+{
+ loff_t pos;
+ off_t off;
+ ssize_t ret;
+
+ if (offset) {
+ if (unlikely(get_user(off, offset)))
+ return -EFAULT;
+ pos = off;
+ ret = do_sendfile(out_fd, in_fd, &pos, count, MAX_NON_LFS);
+ if (unlikely(put_user(pos, offset)))
+ return -EFAULT;
+ return ret;
+ }
+
+ return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+
+COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
+ compat_loff_t __user *, offset, compat_size_t, count)
+{
+ loff_t pos;
+ ssize_t ret;
+
+ if (offset) {
+ if (unlikely(copy_from_user(&pos, offset, sizeof(loff_t))))
+ return -EFAULT;
+ ret = do_sendfile(out_fd, in_fd, &pos, count, 0);
+ if (unlikely(put_user(pos, offset)))
+ return -EFAULT;
+ return ret;
+ }
+
+ return do_sendfile(out_fd, in_fd, NULL, count, 0);
+}
+#endif
+
--
2.1.0
Many embedded systems will not need the splice-family syscalls (splice,
vmsplice, tee and sendfile). Omitting them saves space. This adds a new EXPERT
config option CONFIG_SYSCALL_SPLICE (default y) to support compiling them out.
The goal is to completely compile out fs/splice along with the syscalls. To
achieve this, the remainder of the patch-set will deal with fs/splice exports.
As far as possible, the impact on other device drivers will be minimized so as
to reduce the overall maintenance burden of CONFIG_SYSCALL_SPLICE.
The use of exported functions will be solved by transparently mocking them out
with static inlines. Uses of the exported pipe_buf_operations struct however
require direct modification in fs/fuse and net/core. The next two patches will
deal with this.
The last change required before fs/splice can be compiled out is making fs/nfsd
aware of the lack of splice support in file-systems when CONFIG_SYSCALL_SPLICE
is undefined.
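The kernel/sys_ni.c hunk of this patch relies on cond_syscall(), which gives
each listed syscall a weak fallback that returns -ENOSYS when no real
definition is linked in. A minimal userspace sketch of that mechanism follows.
It assumes GCC or Clang on an ELF target; DEMO_COND_SYSCALL, demo_ni_syscall,
demo_sys_splice and demo_sys_tee are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's "not implemented" syscall. */
long demo_ni_syscall(void)
{
	return -ENOSYS;
}

/*
 * Hypothetical stand-in for cond_syscall(): declare the symbol as a weak
 * alias of demo_ni_syscall(). A strong definition in another object file,
 * if one is linked, takes precedence over the alias.
 */
#define DEMO_COND_SYSCALL(name) \
	long name(void) __attribute__((weak, alias("demo_ni_syscall")))

DEMO_COND_SYSCALL(demo_sys_splice);
DEMO_COND_SYSCALL(demo_sys_tee);

/* Usage check: no strong definitions exist, so both resolve to the fallback. */
static void demo_check(void)
{
	assert(demo_sys_splice() == -ENOSYS);
	assert(demo_sys_tee() == -ENOSYS);
}
```

With this arrangement the syscall table needs no conditional entries: the
same symbol names resolve either to the real implementation or to the
fallback, depending on what was compiled.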
The bloat benefit of this patch given a tinyconfig is:
add/remove: 0/16 grow/shrink: 2/5 up/down: 114/-3693 (-3579)
function old new delta
splice_direct_to_actor 348 416 +68
splice_to_pipe 371 417 +46
splice_from_pipe_next 107 106 -1
fdput 11 - -11
signal_pending 39 26 -13
fdget 56 42 -14
user_page_pipe_buf_ops 20 - -20
user_page_pipe_buf_steal 25 - -25
file_end_write 58 29 -29
file_start_write 68 34 -34
pipe_to_user 43 - -43
wakeup_pipe_readers 54 - -54
do_splice_to 87 - -87
ipipe_prep.part 92 - -92
opipe_prep.part 119 - -119
sys_sendfile 122 - -122
sys_sendfile64 126 - -126
sys_vmsplice 137 - -137
vmsplice_to_user 205 - -205
sys_tee 491 - -491
do_sendfile 492 - -492
vmsplice_to_pipe 558 - -558
sys_splice 1020 - -1020
Signed-off-by: Pieter Smith <[email protected]>
---
fs/splice.c | 2 ++
init/Kconfig | 10 ++++++++++
kernel/sys_ni.c | 8 ++++++++
3 files changed, 20 insertions(+)
diff --git a/fs/splice.c b/fs/splice.c
index 44b201b..7c4c695 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1316,6 +1316,7 @@ long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,
return ret;
}
+#ifdef CONFIG_SYSCALL_SPLICE
static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
struct pipe_inode_info *opipe,
size_t len, unsigned int flags);
@@ -2200,4 +2201,5 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
return do_sendfile(out_fd, in_fd, NULL, count, 0);
}
#endif
+#endif
diff --git a/init/Kconfig b/init/Kconfig
index d811d5f..dec9819 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1571,6 +1571,16 @@ config NTP
system clock to an NTP server, you can disable this option to save
space.
+config SYSCALL_SPLICE
+ bool "Enable splice/vmsplice/tee/sendfile syscalls" if EXPERT
+ default y
+ help
+ This option enables the splice, vmsplice, tee and sendfile syscalls. These
+ are used by applications to: move data between buffers and arbitrary file
+ descriptors; "copy" data between buffers; or copy data from userspace into
+ buffers. If building an embedded system where no applications use these
+ syscalls, you can disable this option to save space.
+
config PCI_QUIRKS
default y
bool "Enable PCI quirk workarounds" if EXPERT
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index d2f5b00..25d5551 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -170,6 +170,14 @@ cond_syscall(sys_fstat);
cond_syscall(sys_stat);
cond_syscall(sys_uname);
cond_syscall(sys_olduname);
+cond_syscall(sys_vmsplice);
+cond_syscall(sys_splice);
+cond_syscall(sys_tee);
+cond_syscall(sys_sendfile);
+cond_syscall(sys_sendfile64);
+cond_syscall(compat_sys_vmsplice);
+cond_syscall(compat_sys_sendfile);
+cond_syscall(compat_sys_sendfile64);
/* arch-specific weak syscall entries */
cond_syscall(sys_pciconfig_read);
--
2.1.0
To implement splice support, fs/fuse makes use of nosteal_pipe_buf_ops. This
struct is exported by fs/splice. The goal of the larger patch set is to
completely compile out fs/splice, so uses of the exported struct need to be
compiled out along with fs/splice.
This patch therefore compiles out splice support in fs/fuse when
CONFIG_SYSCALL_SPLICE is undefined.
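The approach taken in the diff below (defining the handler name to NULL when
it is compiled out) can be sketched in userspace as follows. All names here
(struct demo_file_ops, demo_dev_splice_read, demo_do_splice_read,
DEMO_SYSCALL_SPLICE) are hypothetical, and the -EINVAL returned by the caller
is an assumption of this example rather than documented kernel behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the Kconfig symbol; left undefined here. */
/* #define DEMO_SYSCALL_SPLICE 1 */

/* Simplified operations table with an optional splice_read handler. */
struct demo_file_ops {
	ssize_t (*read)(char *buf, size_t len);
	ssize_t (*splice_read)(size_t len);	/* may be NULL */
};

static ssize_t demo_dev_read(char *buf, size_t len)
{
	(void)buf;
	return (ssize_t)len;
}

#ifdef DEMO_SYSCALL_SPLICE
static ssize_t demo_dev_splice_read(size_t len)
{
	return (ssize_t)len;
}
#else
/* Handler compiled out: the initializer below still builds unchanged. */
#define demo_dev_splice_read NULL
#endif

static const struct demo_file_ops demo_dev_ops = {
	.read		= demo_dev_read,
	.splice_read	= demo_dev_splice_read,
};

/* Hypothetical caller that treats a missing handler as "not supported". */
static ssize_t demo_do_splice_read(const struct demo_file_ops *ops, size_t len)
{
	if (!ops->splice_read)
		return -EINVAL;
	return ops->splice_read(len);
}

/* Usage check: with the symbol undefined, the slot is NULL. */
static void demo_check(void)
{
	assert(demo_dev_ops.splice_read == NULL);
	assert(demo_do_splice_read(&demo_dev_ops, 8) == -EINVAL);
}
```

The benefit of the #define is that the operations table initializer stays
free of #ifdef blocks.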
Signed-off-by: Pieter Smith <[email protected]>
---
fs/fuse/dev.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ca88731..99f1ff4 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1291,6 +1291,7 @@ static ssize_t fuse_dev_read(struct kiocb *iocb, const struct iovec *iov,
return fuse_dev_do_read(fc, file, &cs, iov_length(iov, nr_segs));
}
+#ifdef CONFIG_SYSCALL_SPLICE
static ssize_t fuse_dev_splice_read(struct file *in, loff_t *ppos,
struct pipe_inode_info *pipe,
size_t len, unsigned int flags)
@@ -1368,6 +1369,9 @@ out:
kfree(bufs);
return ret;
}
+#else /* CONFIG_SYSCALL_SPLICE */
+#define fuse_dev_splice_read NULL
+#endif
static int fuse_notify_poll(struct fuse_conn *fc, unsigned int size,
struct fuse_copy_state *cs)
--
2.1.0
The goal of the larger patch set is to completely compile out fs/splice, and
as a result, splice support for all file-systems. This patch ensures that
fs/nfsd falls back to non-splice fs support when CONFIG_SYSCALL_SPLICE is
undefined.
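The fallback relies on IS_ENABLED(), which evaluates to 1 or 0 at compile time
and, unlike #ifdef, can be used in an ordinary expression. The sketch below is
a simplified userspace imitation (it ignores the module case); DEMO_IS_ENABLED,
struct demo_rqst and the DEMO_CONFIG_* symbols are hypothetical. One property
worth keeping in mind: a misspelled symbol name evaluates to 0 without any
diagnostic.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified imitation of the kernel's IS_ENABLED() preprocessor trick. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND(ignored, val, ...) val
#define DEMO_IS_ENABLED(opt) DEMO_IS_ENABLED_1(opt)
#define DEMO_IS_ENABLED_1(val) DEMO_IS_ENABLED_2(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED_2(arg1_or_junk) DEMO_TAKE_SECOND(arg1_or_junk 1, 0, 0)

/* Hypothetical configuration: the option is switched on for this demo. */
#define DEMO_CONFIG_SYSCALL_SPLICE 1

/* Hypothetical, heavily reduced request structure. */
struct demo_rqst {
	bool splice_ok;
};

static void demo_process_common(struct demo_rqst *rq)
{
	/* Compile-time constant used as a run-time default. */
	rq->splice_ok = DEMO_IS_ENABLED(DEMO_CONFIG_SYSCALL_SPLICE);
}

static const char *demo_read_path(const struct demo_rqst *rq)
{
	return rq->splice_ok ? "splice" : "copy";
}

/* Usage check, including the silent-zero behaviour for an unknown symbol. */
static void demo_check(void)
{
	struct demo_rqst rq;

	demo_process_common(&rq);
	assert(strcmp(demo_read_path(&rq), "splice") == 0);
	assert(DEMO_IS_ENABLED(DEMO_CONFIG_SPLICE_SYSCALL) == 0);
}
```

Removing the DEMO_CONFIG_SYSCALL_SPLICE definition makes demo_read_path()
return "copy" with no other source change.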
Signed-off-by: Pieter Smith <[email protected]>
---
net/sunrpc/svc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index ca8a795..6cacc37 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -1084,7 +1084,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
goto err_short_len;
/* Will be turned off only in gss privacy case: */
- rqstp->rq_splice_ok = true;
+ rqstp->rq_splice_ok = IS_ENABLED(CONFIG_SYSCALL_SPLICE);
/* Will be turned off only when NFSv4 Sessions are used */
rqstp->rq_usedeferral = true;
rqstp->rq_dropme = false;
--
2.1.0
Entirely compile out the splice translation unit when the system is configured
without the splice family of syscalls (i.e. CONFIG_SYSCALL_SPLICE is undefined).
Exported fs/splice functions are transparently mocked out with static inlines.
Because userspace support for splice has already been removed by this
patch-set, the exported functions cannot be called anyway. Mocking them out
prevents a maintenance burden on file system drivers.
The bloat score resulting from this patch given a tinyconfig is:
add/remove: 0/25 grow/shrink: 0/5 up/down: 0/-4845 (-4845)
function old new delta
pipe_to_null 4 - -4
generic_pipe_buf_nosteal 6 - -6
spd_release_page 10 - -10
PageUptodate 22 11 -11
lock_page 36 24 -12
page_cache_pipe_buf_release 16 - -16
splice_write_null 24 4 -20
page_cache_pipe_buf_ops 20 - -20
nosteal_pipe_buf_ops 20 - -20
default_pipe_buf_ops 20 - -20
generic_splice_sendpage 24 - -24
splice_shrink_spd 27 - -27
direct_splice_actor 47 - -47
default_file_splice_write 49 - -49
wakeup_pipe_writers 54 - -54
write_pipe_buf 71 - -71
page_cache_pipe_buf_confirm 80 - -80
splice_grow_spd 87 - -87
splice_from_pipe 93 - -93
splice_from_pipe_next 106 - -106
pipe_to_sendpage 109 - -109
page_cache_pipe_buf_steal 114 - -114
generic_file_splice_read 131 8 -123
do_splice_direct 148 - -148
__splice_from_pipe 246 - -246
splice_direct_to_actor 416 - -416
splice_to_pipe 417 - -417
default_file_splice_read 688 - -688
iter_file_splice_write 702 4 -698
__generic_file_splice_read 1109 - -1109
The bloat score for the entire CONFIG_SYSCALL_SPLICE patch-set is:
add/remove: 0/41 grow/shrink: 5/7 up/down: 23/-8422 (-8399)
function old new delta
sys_pwritev 115 122 +7
sys_preadv 115 122 +7
fdput_pos 29 36 +7
sys_pwrite64 115 116 +1
sys_pread64 115 116 +1
pipe_to_null 4 - -4
generic_pipe_buf_nosteal 6 - -6
spd_release_page 10 - -10
fdput 11 - -11
PageUptodate 22 11 -11
lock_page 36 24 -12
signal_pending 39 26 -13
fdget 56 42 -14
page_cache_pipe_buf_release 16 - -16
user_page_pipe_buf_ops 20 - -20
splice_write_null 24 4 -20
page_cache_pipe_buf_ops 20 - -20
nosteal_pipe_buf_ops 20 - -20
default_pipe_buf_ops 20 - -20
generic_splice_sendpage 24 - -24
user_page_pipe_buf_steal 25 - -25
splice_shrink_spd 27 - -27
pipe_to_user 43 - -43
direct_splice_actor 47 - -47
default_file_splice_write 49 - -49
wakeup_pipe_writers 54 - -54
wakeup_pipe_readers 54 - -54
write_pipe_buf 71 - -71
page_cache_pipe_buf_confirm 80 - -80
splice_grow_spd 87 - -87
do_splice_to 87 - -87
ipipe_prep.part 92 - -92
splice_from_pipe 93 - -93
splice_from_pipe_next 107 - -107
pipe_to_sendpage 109 - -109
page_cache_pipe_buf_steal 114 - -114
opipe_prep.part 119 - -119
sys_sendfile 122 - -122
generic_file_splice_read 131 8 -123
sys_sendfile64 126 - -126
sys_vmsplice 137 - -137
do_splice_direct 148 - -148
vmsplice_to_user 205 - -205
__splice_from_pipe 246 - -246
splice_direct_to_actor 348 - -348
splice_to_pipe 371 - -371
do_sendfile 492 - -492
sys_tee 497 - -497
vmsplice_to_pipe 558 - -558
default_file_splice_read 688 - -688
iter_file_splice_write 702 4 -698
sys_splice 1075 - -1075
__generic_file_splice_read 1109 - -1109
Signed-off-by: Pieter Smith <[email protected]>
---
fs/Makefile | 3 ++-
fs/splice.c | 2 --
include/linux/fs.h | 26 ++++++++++++++++++++++++++
include/linux/splice.h | 42 ++++++++++++++++++++++++++++++++++++++++++
4 files changed, 70 insertions(+), 3 deletions(-)
diff --git a/fs/Makefile b/fs/Makefile
index fb7646e..9395622 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -10,7 +10,7 @@ obj-y := open.o read_write.o file_table.o super.o \
ioctl.o readdir.o select.o dcache.o inode.o \
attr.o bad_inode.o file.o filesystems.o namespace.o \
seq_file.o xattr.o libfs.o fs-writeback.o \
- pnode.o splice.o sync.o utimes.o \
+ pnode.o sync.o utimes.o \
stack.o fs_struct.o statfs.o fs_pin.o
ifeq ($(CONFIG_BLOCK),y)
@@ -22,6 +22,7 @@ endif
obj-$(CONFIG_PROC_FS) += proc_namespace.o
obj-$(CONFIG_FSNOTIFY) += notify/
+obj-$(CONFIG_SYSCALL_SPLICE) += splice.o
obj-$(CONFIG_EPOLL) += eventpoll.o
obj-$(CONFIG_ANON_INODES) += anon_inodes.o
obj-$(CONFIG_SIGNALFD) += signalfd.o
diff --git a/fs/splice.c b/fs/splice.c
index 7c4c695..44b201b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1316,7 +1316,6 @@ long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,
return ret;
}
-#ifdef CONFIG_SYSCALL_SPLICE
static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
struct pipe_inode_info *opipe,
size_t len, unsigned int flags);
@@ -2201,5 +2200,4 @@ COMPAT_SYSCALL_DEFINE4(sendfile64, int, out_fd, int, in_fd,
return do_sendfile(out_fd, in_fd, NULL, count, 0);
}
#endif
-#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a957d43..138107e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2444,6 +2444,7 @@ extern int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
extern void block_sync_page(struct page *page);
/* fs/splice.c */
+#ifdef CONFIG_SYSCALL_SPLICE
extern ssize_t generic_file_splice_read(struct file *, loff_t *,
struct pipe_inode_info *, size_t, unsigned int);
extern ssize_t default_file_splice_read(struct file *, loff_t *,
@@ -2452,6 +2453,31 @@ extern ssize_t iter_file_splice_write(struct pipe_inode_info *,
struct file *, loff_t *, size_t, unsigned int);
extern ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe,
struct file *out, loff_t *, size_t len, unsigned int flags);
+#else
+static inline ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe, size_t len, unsigned int flags)
+{
+ return -EPERM;
+}
+
+static inline ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe, size_t len, unsigned int flags)
+{
+ return -EPERM;
+}
+
+static inline ssize_t iter_file_splice_write(struct pipe_inode_info *pipe,
+ struct file *out, loff_t *ppos, size_t len, unsigned int flags)
+{
+ return -EPERM;
+}
+
+static inline ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe,
+ struct file *out, loff_t *ppos, size_t len, unsigned int flags)
+{
+ return -EPERM;
+}
+#endif
extern void
file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping);
diff --git a/include/linux/splice.h b/include/linux/splice.h
index da2751d..34570d8 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -65,6 +65,7 @@ typedef int (splice_actor)(struct pipe_inode_info *, struct pipe_buffer *,
typedef int (splice_direct_actor)(struct pipe_inode_info *,
struct splice_desc *);
+#ifdef CONFIG_SYSCALL_SPLICE
extern ssize_t splice_from_pipe(struct pipe_inode_info *, struct file *,
loff_t *, size_t, unsigned int,
splice_actor *);
@@ -74,13 +75,54 @@ extern ssize_t splice_to_pipe(struct pipe_inode_info *,
struct splice_pipe_desc *);
extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
splice_direct_actor *);
+#else
+static inline ssize_t splice_from_pipe(struct pipe_inode_info *pipe, struct file *out,
+ loff_t *ppos, size_t len, unsigned int flags,
+ splice_actor *actor)
+{
+ return -EPERM;
+}
+
+static inline ssize_t __splice_from_pipe(struct pipe_inode_info *pipe, struct splice_desc *sd,
+ splice_actor *actor)
+{
+ return -EPERM;
+}
+
+static inline ssize_t splice_to_pipe(struct pipe_inode_info *pipe,
+ struct splice_pipe_desc *spd)
+{
+ return -EPERM;
+}
+
+static inline ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
+ splice_direct_actor *actor)
+{
+ return -EPERM;
+}
+#endif
/*
* for dynamic pipe sizing
*/
+#ifdef CONFIG_SYSCALL_SPLICE
extern int splice_grow_spd(const struct pipe_inode_info *, struct splice_pipe_desc *);
extern void splice_shrink_spd(struct splice_pipe_desc *);
extern void spd_release_page(struct splice_pipe_desc *, unsigned int);
+#else
+static inline int splice_grow_spd(const struct pipe_inode_info *pipe, struct splice_pipe_desc *spd)
+{
+ return -EPERM;
+}
+
+static inline void splice_shrink_spd(struct splice_pipe_desc *spd)
+{
+}
+
+static inline void spd_release_page(struct splice_pipe_desc *spd, unsigned int i)
+{
+}
+#endif
extern const struct pipe_buf_operations page_cache_pipe_buf_ops;
#endif
--
2.1.0
To implement splice support, net/core makes use of nosteal_pipe_buf_ops. This
struct is exported by fs/splice. The goal of the larger patch set is to
completely compile out fs/splice, so uses of the exported struct need to be
compiled out along with fs/splice.
This patch therefore compiles out splice support in net/core when
CONFIG_SYSCALL_SPLICE is undefined. The compiled out function skb_splice_bits
is transparently mocked out with a static inline. The greater patch set removes
userspace splice support so it cannot be called anyway.
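The diff below also marks two static helpers __maybe_unused, since their only
caller disappears when the option is off. A minimal sketch of that
combination, assuming GCC or Clang; demo_maybe_unused, demo_fill_segments,
demo_splice_bits and DEMO_SYSCALL_SPLICE are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical equivalent of the kernel's __maybe_unused annotation. */
#define demo_maybe_unused __attribute__((unused))

/* Hypothetical stand-in for the Kconfig symbol; left undefined here. */
/* #define DEMO_SYSCALL_SPLICE 1 */

/*
 * Helper whose only caller is the real demo_splice_bits(). The attribute
 * suppresses the unused-function warning when that caller is compiled out.
 */
static int demo_maybe_unused demo_fill_segments(int len)
{
	return len;
}

#ifdef DEMO_SYSCALL_SPLICE
static int demo_splice_bits(int len)
{
	return demo_fill_segments(len);
}
#else
/* Failing stub with the same signature. */
static inline int demo_splice_bits(int len)
{
	(void)len;
	return -EPERM;
}
#endif

/* Usage check: with the symbol undefined, only the stub is reachable. */
static void demo_check(void)
{
	assert(demo_splice_bits(64) == -EPERM);
}
```

Annotating the helpers keeps the #ifdef confined to one function, at the cost
of relying on the compiler to discard the unreferenced static code.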
Signed-off-by: Pieter Smith <[email protected]>
---
include/linux/skbuff.h | 10 ++++++++++
net/core/skbuff.c | 11 +++++++----
2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a59d934..5cd636b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2640,9 +2640,19 @@ int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len);
int skb_store_bits(struct sk_buff *skb, int offset, const void *from, int len);
__wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset, u8 *to,
int len, __wsum csum);
+#ifdef CONFIG_SYSCALL_SPLICE
int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
struct pipe_inode_info *pipe, unsigned int len,
unsigned int flags);
+#else
+static inline int
+skb_splice_bits(struct sk_buff *skb, unsigned int offset,
+ struct pipe_inode_info *pipe, unsigned int len,
+ unsigned int flags)
+{
+ return -EPERM;
+}
+#endif
void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to);
unsigned int skb_zerocopy_headlen(const struct sk_buff *from);
int skb_zerocopy(struct sk_buff *to, struct sk_buff *from,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 61059a0..bb426d9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1678,7 +1678,8 @@ EXPORT_SYMBOL(skb_copy_bits);
* Callback from splice_to_pipe(), if we need to release some pages
* at the end of the spd in case we error'ed out in filling the pipe.
*/
-static void sock_spd_release(struct splice_pipe_desc *spd, unsigned int i)
+static void __maybe_unused sock_spd_release(struct splice_pipe_desc *spd,
+ unsigned int i)
{
put_page(spd->pages[i]);
}
@@ -1781,9 +1782,9 @@ static bool __splice_segment(struct page *page, unsigned int poff,
* Map linear and fragment data from the skb to spd. It reports true if the
* pipe is full or if we already spliced the requested length.
*/
-static bool __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
- unsigned int *offset, unsigned int *len,
- struct splice_pipe_desc *spd, struct sock *sk)
+static bool __maybe_unused __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
+ unsigned int *offset, unsigned int *len,
+ struct splice_pipe_desc *spd, struct sock *sk)
{
int seg;
@@ -1821,6 +1822,7 @@ static bool __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
* the frag list, if such a thing exists. We'd probably need to recurse to
* handle that cleanly.
*/
+#ifdef CONFIG_SYSCALL_SPLICE
int skb_splice_bits(struct sk_buff *skb, unsigned int offset,
struct pipe_inode_info *pipe, unsigned int tlen,
unsigned int flags)
@@ -1876,6 +1878,7 @@ done:
return ret;
}
+#endif /* CONFIG_SYSCALL_SPLICE */
/**
* skb_store_bits - store bits from kernel buffer to skb
--
2.1.0
kernel_write shares infrastructure with the read_write translation unit but not
with the splice translation unit. Grouping kernel_write with the read_write
translation unit is more logical. It also paves the way to compiling out the
splice group of syscalls for embedded systems that do not need them.
Signed-off-by: Pieter Smith <[email protected]>
---
fs/read_write.c | 16 ++++++++++++++++
fs/splice.c | 16 ----------------
2 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index d9451ba..f4c8d8b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1191,3 +1191,19 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
}
#endif
+ssize_t kernel_write(struct file *file, const char *buf, size_t count,
+ loff_t pos)
+{
+ mm_segment_t old_fs;
+ ssize_t res;
+
+ old_fs = get_fs();
+ set_fs(get_ds());
+ /* The cast to a user pointer is valid due to the set_fs() */
+ res = vfs_write(file, (__force const char __user *)buf, count, &pos);
+ set_fs(old_fs);
+
+ return res;
+}
+EXPORT_SYMBOL(kernel_write);
+
diff --git a/fs/splice.c b/fs/splice.c
index c1a2861..44b201b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -583,22 +583,6 @@ static ssize_t kernel_readv(struct file *file, const struct iovec *vec,
return res;
}
-ssize_t kernel_write(struct file *file, const char *buf, size_t count,
- loff_t pos)
-{
- mm_segment_t old_fs;
- ssize_t res;
-
- old_fs = get_fs();
- set_fs(get_ds());
- /* The cast to a user pointer is valid due to the set_fs() */
- res = vfs_write(file, (__force const char __user *)buf, count, &pos);
- set_fs(old_fs);
-
- return res;
-}
-EXPORT_SYMBOL(kernel_write);
-
ssize_t default_file_splice_read(struct file *in, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len,
unsigned int flags)
--
2.1.0
On Thu, Dec 4, 2014 at 6:50 PM, Pieter Smith <[email protected]> wrote:
> To implement splice support, fs/fuse makes use of nosteal_pipe_buf_ops. This
> struct is exported by fs/splice. The goal of the larger patch set is to
> completely compile out fs/splice, so uses of the exported struct need to be
> compiled out along with fs/splice.
>
> This patch therefore compiles out splice support in fs/fuse when
> CONFIG_SYSCALL_SPLICE is undefined.
>
> Signed-off-by: Pieter Smith <[email protected]>
In the future could you PLEASE PLEASE cut the fuse-devel Cc from the
non-fuse specific patches (and I guess that goes for any other
subsystem specific lists and persons as well)?
Otherwise:
Acked-by: Miklos Szeredi <[email protected]>
Thanks,
Miklos
On Fri, Dec 05, 2014 at 11:44:36AM +0100, Miklos Szeredi wrote:
> On Thu, Dec 4, 2014 at 6:50 PM, Pieter Smith <[email protected]> wrote:
> > To implement splice support, fs/fuse makes use of nosteal_pipe_buf_ops. This
> > struct is exported by fs/splice. The goal of the larger patch set is to
> > completely compile out fs/splice, so uses of the exported struct need to be
> > compiled out along with fs/splice.
> >
> > This patch therefore compiles out splice support in fs/fuse when
> > CONFIG_SYSCALL_SPLICE is undefined.
> >
> > Signed-off-by: Pieter Smith <[email protected]>
>
>
> In the future could you PLEASE PLEASE cut the fuse-devel Cc from the
> non-fuse specific patches (and I guess that goes for any other
> subsystem specific lists and persons as well)?
>
> Otherwise:
>
> Acked-by: Miklos Szeredi <[email protected]>
>
> Thanks,
> Miklos
I added your Acked-by and removed you and fuse-dev from postings on future
versions of this patch-set.
Thanks,
Pieter