Christoph's fs/read_write.c series - consolidation and
cleanups.
The following changes since commit 20223f0f39ea9d31ece08f04ac79f8c4e8d98246:
fs: pass on flags in compat_writev (2017-06-16 18:40:51 +0900)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git work.read_write
for you to fetch changes up to a4058c5bce8aded1a12a59990e84e481a96fb490:
nfsd: remove nfsd_vfs_read (2017-06-29 17:49:24 -0400)
----------------------------------------------------------------
Christoph Hellwig (8):
fs: remove do_readv_writev
fs: remove do_compat_readv_writev
fs: remove __do_readv_writev
fs: move more code into do_iter_read/do_iter_write
fs: implement vfs_iter_read using do_iter_read
fs: implement vfs_iter_write using do_iter_write
nfsd: use vfs_iter_read/write
nfsd: remove nfsd_vfs_read
drivers/block/loop.c | 6 +-
drivers/target/target_core_file.c | 6 +-
fs/coda/file.c | 4 +-
fs/nfsd/vfs.c | 34 +++---
fs/read_write.c | 220 ++++++++++++++++----------------------
fs/splice.c | 2 +-
include/linux/fs.h | 6 +-
7 files changed, 117 insertions(+), 161 deletions(-)
On Wed, Jul 5, 2017 at 12:14 AM, Al Viro <[email protected]> wrote:
>
> Christoph's fs/read_write.c series - consolidation and cleanups.
Side note - when looking through this, it struck me how confusing that
"int flags" argument was.
We have a ton of "flags" in the filesystem layer, and now all the
read/write helpers take them too, and it's really hard to see what
kind of flags they are.
Could we perhaps make those RWF_xyz flags have a nice bitwise type,
and use that type in the argument list, so that not only could there
be some sparse typechecking, but the functions that pass flags on to
each other would automatically have a certain amount of actual
self-documenting prototypes?
So when you look at one of those vfs_iter_write() or whatever
functions, you just see *what* flags the flags argument is.
Because "int flags" really is the worst. It's the wrong type anyway
(at least make it unsigned if it's a collection of bits), but it's
also very ambiguous indeed when there are so many other flags that are
often used/tested in the same functions (there's the "iter" flags,
there's file->f_mode, there's just a lot of different flags going on,
and the "int flags" is the least well documented of them all,
particularly since 99.9% of all users just pass in zero).
Hmm?
Linus
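To illustrate the suggestion above, here is a minimal userspace sketch of what a bitwise flags type buys. Everything named demo_* or DEMO_* is hypothetical and only stands in for the kernel's own annotations: DEMO_BITWISE and DEMO_FORCE mimic __bitwise and __force, which mean something only to sparse (sparse defines __CHECKER__) and expand to nothing under a normal compiler. The -EOPNOTSUPP return mirrors the check in kiocb_set_rw_flags() from the patch later in this thread; it is an assumption that a real helper would behave the same way.

```c
#include <assert.h>
#include <errno.h>

/* Annotations are visible to sparse only; a normal compiler sees nothing. */
#ifdef __CHECKER__
#define DEMO_BITWISE __attribute__((bitwise))
#define DEMO_FORCE   __attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

/* Hypothetical flag type: the prototype now says which flags are meant. */
typedef unsigned int DEMO_BITWISE demo_rwf_t;

#define DEMO_RWF_HIPRI     ((DEMO_FORCE demo_rwf_t)0x00000001)
#define DEMO_RWF_DSYNC     ((DEMO_FORCE demo_rwf_t)0x00000002)
#define DEMO_RWF_SUPPORTED (DEMO_RWF_HIPRI | DEMO_RWF_DSYNC)

/* Reject any bit outside the supported mask. */
int demo_check_flags(demo_rwf_t flags)
{
	if (flags & ~DEMO_RWF_SUPPORTED)
		return -EOPNOTSUPP;
	return 0;
}

/* Small usage example; passing a plain int here would make sparse warn. */
void demo_rwf_selftest(void)
{
	assert(demo_check_flags(DEMO_RWF_HIPRI) == 0);
	assert(demo_check_flags((DEMO_FORCE demo_rwf_t)0x10) == -EOPNOTSUPP);
}
```

With a plain "int flags" none of this is checked; with the typedef, a caller that passes some other kind of flags gets a sparse warning, and the prototype documents itself.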
On Wed, Jul 05, 2017 at 02:51:43PM -0700, Linus Torvalds wrote:
> On Wed, Jul 5, 2017 at 12:14 AM, Al Viro <[email protected]> wrote:
> >
> > Christoph's fs/read_write.c series - consolidation and cleanups.
>
> Side note - when looking through this, it struck me how confusing that
> "int flags" argument was.
>
> We have a ton of "flags" in the filesystem layer, and how all the
> read/write helpers take them too, and it's really hard to see what
> kind of flags they are.
>
> Could we perhaps make those RWF_xyz flags have a nice bitwise type,
> and use that type in the argument list, so that not only could there
> be some sparse typechecking, but the functions that pass flags on to
> each other would automatically have a certain amount of actual
> self-documenting prototypes?
>
> So when you look at one of those vfs_iter_write() or whatever
> functions, you just see *what* flags the flags argument is.
>
> Because "int flags" really is the worst. It's the wrong type anyway
> (at least make it unsigned if it's a collection of bits), but it's
> also very ambiguous indeed when there are so many other flags that are
> often used/tested in the same functions (there's the "iter" flagsm,
> there's file->f_mode, there's just a lot of different flags going on,
> and the "int flags" is the least well documented of them all,
> particularly since 99.9% of all users just pass in zero).
Sure, makes sense - especially since it's not too widely spread yet.
A side note right back at you - POLL... stuff. I'd redone the old
"hunt the buggy ->poll() instances down" series (took about 12 hours
total), got it to the point where all remaining sparse warnings about
that type are for genuine bugs. It goes like this:
define __poll_t, annotate constants
Type is controlled by ifdef - it's a plain unsigned int unless CHECK_POLL is
defined, in which case it's a bitwise type.
->poll() methods should return __poll_t
annotate the places where ->poll() return values go
annotate poll-related wait keys
annotate poll_table_struct ->_key
That ends all infrastructure work. Methods declarations are annotated,
instances are *not*. Due to that ifdef CHECK_POLL, normal builds, including
normal sparse builds, are unaffected; with CF=-DCHECK_POLL you get __poll_t
warnings.
cris: annotate ->poll() instances
ia64: annotate ->poll() instances
mips: annotate ->poll() instances
ppc: annotate ->poll() instances
um: annotate ->poll() instances
x86: annotate ->poll() instances
block: annotate ->poll() instances
crypto: annotate ->poll() instances
acpi: annotate ->poll() instances
sound: annotate ->poll() instances
tomoyo: annotate ->poll() instances
net: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
fs: annotate ->poll() instances
media: annotate ->poll() instances
the rest of drivers/*: annotate ->poll() instances
These can be folded and split as desired - almost up to per-instance. It's
pretty much "turn unsigned int foo_poll(...) into __poll_t foo_poll(...),
turn unsigned int mask; in it into __poll_t mask;" kind of stuff. Can go
on per-subsystem basis just fine - again, normal builds are completely unaffected.
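A rough userspace sketch of the scheme described so far, under stated assumptions: demo_poll_t, DEMO_POLLIN, DEMO_POLLOUT and demo_foo_poll() are made-up names, and the ifdef only approximates how the real __poll_t would be gated. The point is that the type stays a plain unsigned int unless both sparse and CHECK_POLL are in play, and that annotating an instance is just a change of return type and of the local mask.

```c
#include <assert.h>

/* Bitwise only when sparse runs with -DCHECK_POLL; plain unsigned otherwise. */
#if defined(__CHECKER__) && defined(CHECK_POLL)
#define DEMO_POLL_BITWISE __attribute__((bitwise))
#define DEMO_POLL_FORCE   __attribute__((force))
#else
#define DEMO_POLL_BITWISE
#define DEMO_POLL_FORCE
#endif

typedef unsigned int DEMO_POLL_BITWISE demo_poll_t;

/* Annotated constants (generic values of POLLIN and POLLOUT). */
#define DEMO_POLLIN  ((DEMO_POLL_FORCE demo_poll_t)0x0001)
#define DEMO_POLLOUT ((DEMO_POLL_FORCE demo_poll_t)0x0004)

/*
 * Before annotation this would have been
 *	unsigned int demo_foo_poll(...) { unsigned int mask = 0; ... }
 * After annotation only the two types change.
 */
demo_poll_t demo_foo_poll(int readable, int writable)
{
	demo_poll_t mask = 0;

	if (readable)
		mask |= DEMO_POLLIN;
	if (writable)
		mask |= DEMO_POLLOUT;
	return mask;
}

/* Usage example: the force cast is what a checked build needs to look inside. */
void demo_poll_selftest(void)
{
	assert((DEMO_POLL_FORCE unsigned int)demo_foo_poll(1, 1) == 0x0005);
}
```

In a normal build both macros expand to nothing, so the generated code is identical to the unannotated version.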
scif: annotate scif_pollepd
vhost: annotate vhost_poll
dmabuf: annotate dma_buf->active
Several drivers playing games of their own with POLL... bitmaps.
annotate fs/select.c and fs/eventpoll.c
That, of course, can move up right after the infrastructure.
<fixes for assorted bugs caught by all that>
Again, can be reordered in front of the entire queue. Some are brainos
(POLL_IN instead of POLLIN - compare the kernel definitions of those),
some are "what do you mean, no returning -E... from ->poll()?". However,
there's the shitty part - poll/epoll ABI mess. POLLWR... and POLLRDHUP
are architecture-dependent; EPOLL counterparts are not and both are parts
of ABI. Consider e.g. sparc:
#define POLLWRNORM POLLOUT [4, that is]
#define POLLWRBAND 256
#define POLLMSG 512
#define POLLREMOVE 1024
#define POLLRDHUP 2048
and compare with
#define EPOLLWRNORM 0x00000100
#define EPOLLWRBAND 0x00000200
#define EPOLLMSG 0x00000400
#define EPOLLRDHUP 0x00002000
EPOLLRDHUP is never matched. Neither is EPOLLMSG (nothing raises
POLLREMOVE, but then nothing raises POLLMSG either). EPOLLWRBAND
is not matched either (that would be POLLMSG). And EPOLLWRNORM
is matched when we raise POLLWRBAND.
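The sparc mismatch can be checked mechanically. The sketch below uses only the constants quoted above; the table layout and the helper demo_sparc_alias() are illustrative names, not kernel code. For a given EPOLL name it reports which sparc POLL constant, if any, occupies the same bit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_bit {
	const char *name;
	unsigned int value;
};

/* Architecture-independent EPOLL values, as quoted in the message. */
static const struct demo_bit demo_epoll[] = {
	{ "EPOLLWRNORM", 0x00000100 },
	{ "EPOLLWRBAND", 0x00000200 },
	{ "EPOLLMSG",    0x00000400 },
	{ "EPOLLRDHUP",  0x00002000 },
};

/* sparc POLL values, as quoted in the message. */
static const struct demo_bit demo_sparc_poll[] = {
	{ "POLLWRNORM", 4 },
	{ "POLLWRBAND", 256 },
	{ "POLLMSG",    512 },
	{ "POLLREMOVE", 1024 },
	{ "POLLRDHUP",  2048 },
};

/* Name of the sparc POLL constant sharing a bit with epoll_name, or NULL. */
const char *demo_sparc_alias(const char *epoll_name)
{
	unsigned int value = 0;
	size_t i;

	for (i = 0; i < sizeof(demo_epoll) / sizeof(demo_epoll[0]); i++)
		if (strcmp(demo_epoll[i].name, epoll_name) == 0)
			value = demo_epoll[i].value;
	if (!value)
		return NULL;

	for (i = 0; i < sizeof(demo_sparc_poll) / sizeof(demo_sparc_poll[0]); i++)
		if (demo_sparc_poll[i].value == value)
			return demo_sparc_poll[i].name;
	return NULL;
}

/* Usage example: EPOLLWRNORM lands on the POLLWRBAND bit. */
void demo_sparc_selftest(void)
{
	assert(strcmp(demo_sparc_alias("EPOLLWRNORM"), "POLLWRBAND") == 0);
	assert(demo_sparc_alias("EPOLLRDHUP") == NULL);
}
```

EPOLLWRBAND maps to POLLMSG and EPOLLMSG to POLLREMOVE, neither of which is ever raised, which is why those two are never matched in practice.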
sparc is the worst case in that respect; mips is somewhat better -
there we have
#define POLLWRNORM POLLOUT
#define POLLWRBAND 0x0100
and everything else is default. IOW, EPOLLWRBAND is never matched
and EPOLLWRNORM is matched when we raise POLLWRBAND. Several other
architectures are like mips (m68k and even more exotic stuff).
I'm not sure what to do about that. Davem is probably in the best
position to tell...
It might be worth merging the infrastructure bits right before -rc1,
maybe this cycle, maybe the next one. It's not that hard to redo
every time, but...
On Wed, Jul 05, 2017 at 11:38:21PM +0100, Al Viro wrote:
> Sure, makes sense - especially since it's not too widely spread yet.
Do you want to do that yourself, or do you want me to look into it?
On Thu, Jul 06, 2017 at 12:52:35AM +0200, Christoph Hellwig wrote:
> On Wed, Jul 05, 2017 at 11:38:21PM +0100, Al Viro wrote:
> > Sure, makes sense - especially since it's not too widely spread yet.
>
> Do you want to do that yourself, or do you want me to look into it?
I'll do it tomorrow, unless you get to it first...
On Thu, Jul 06, 2017 at 12:29:12AM +0100, Al Viro wrote:
> On Thu, Jul 06, 2017 at 12:52:35AM +0200, Christoph Hellwig wrote:
> > On Wed, Jul 05, 2017 at 11:38:21PM +0100, Al Viro wrote:
> > > Sure, makes sense - especially since it's not too widely spread yet.
> >
> > Do you want to do that yourself, or do you want me to look into it?
>
> I'll do it tomorrow, unless you get to it first...
Just did the whole batch (patch below), but it seems like using a
__bitwise type in SYSCALL_DEFINE* will always give warnings like:
fs/read_write.c:1095:1: warning: cast to restricted __kernel_rwf_t
which I'm not sure how to deal with..
--
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 38d0383dc7f9..bc69d40c4e8b 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -969,7 +969,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
int use_wgather;
loff_t pos = offset;
unsigned int pflags = current->flags;
- int flags = 0;
+ rwf_t flags = 0;
if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
/*
diff --git a/fs/read_write.c b/fs/read_write.c
index a2cbc8303dae..bc2db5e5cd19 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -633,7 +633,7 @@ unsigned long iov_shorten(struct iovec *iov, unsigned long nr_segs, size_t to)
EXPORT_SYMBOL(iov_shorten);
static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
- loff_t *ppos, int type, int flags)
+ loff_t *ppos, int type, rwf_t flags)
{
struct kiocb kiocb;
ssize_t ret;
@@ -655,7 +655,7 @@ static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
/* Do it by hand, with file-ops */
static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
- loff_t *ppos, int type, int flags)
+ loff_t *ppos, int type, rwf_t flags)
{
ssize_t ret = 0;
@@ -871,7 +871,7 @@ ssize_t compat_rw_copy_check_uvector(int type,
#endif
static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
- loff_t *pos, int flags)
+ loff_t *pos, rwf_t flags)
{
size_t tot_len;
ssize_t ret = 0;
@@ -899,7 +899,7 @@ static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
}
ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags)
+ rwf_t flags)
{
if (!file->f_op->read_iter)
return -EINVAL;
@@ -908,7 +908,7 @@ ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
EXPORT_SYMBOL(vfs_iter_read);
static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
- loff_t *pos, int flags)
+ loff_t *pos, rwf_t flags)
{
size_t tot_len;
ssize_t ret = 0;
@@ -937,7 +937,7 @@ static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
}
ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags)
+ rwf_t flags)
{
if (!file->f_op->write_iter)
return -EINVAL;
@@ -946,7 +946,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
EXPORT_SYMBOL(vfs_iter_write);
ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -964,7 +964,7 @@ ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
EXPORT_SYMBOL(vfs_readv);
ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -981,7 +981,7 @@ ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
EXPORT_SYMBOL(vfs_writev);
static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, int flags)
+ unsigned long vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
@@ -1001,7 +1001,7 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
}
static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, int flags)
+ unsigned long vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
@@ -1027,7 +1027,7 @@ static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
}
static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret = -EBADF;
@@ -1050,7 +1050,7 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
}
static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret = -EBADF;
@@ -1094,7 +1094,7 @@ SYSCALL_DEFINE5(preadv, unsigned long, fd, const struct iovec __user *, vec,
SYSCALL_DEFINE6(preadv2, unsigned long, fd, const struct iovec __user *, vec,
unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = pos_from_hilo(pos_h, pos_l);
@@ -1114,7 +1114,7 @@ SYSCALL_DEFINE5(pwritev, unsigned long, fd, const struct iovec __user *, vec,
SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = pos_from_hilo(pos_h, pos_l);
@@ -1127,7 +1127,7 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
#ifdef CONFIG_COMPAT
static size_t compat_readv(struct file *file,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -1147,7 +1147,7 @@ static size_t compat_readv(struct file *file,
static size_t do_compat_readv(compat_ulong_t fd,
const struct compat_iovec __user *vec,
- compat_ulong_t vlen, int flags)
+ compat_ulong_t vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret;
@@ -1173,7 +1173,7 @@ COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
static long do_compat_preadv64(unsigned long fd,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret;
@@ -1211,7 +1211,7 @@ COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd,
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
- unsigned long, vlen, loff_t, pos, int, flags)
+ unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
return do_compat_preadv64(fd, vec, vlen, pos, flags);
}
@@ -1220,7 +1220,7 @@ COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
const struct compat_iovec __user *,vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
@@ -1232,7 +1232,7 @@ COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
static size_t compat_writev(struct file *file,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -1252,7 +1252,7 @@ static size_t compat_writev(struct file *file,
static size_t do_compat_writev(compat_ulong_t fd,
const struct compat_iovec __user* vec,
- compat_ulong_t vlen, int flags)
+ compat_ulong_t vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret;
@@ -1277,7 +1277,7 @@ COMPAT_SYSCALL_DEFINE3(writev, compat_ulong_t, fd,
static long do_compat_pwritev64(unsigned long fd,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret;
@@ -1315,7 +1315,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
- unsigned long, vlen, loff_t, pos, int, flags)
+ unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
return do_compat_pwritev64(fd, vec, vlen, pos, flags);
}
@@ -1323,7 +1323,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd,
const struct compat_iovec __user *,vec,
- compat_ulong_t, vlen, u32, pos_low, u32, pos_high, int, flags)
+ compat_ulong_t, vlen, u32, pos_low, u32, pos_high, rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 818568c8e5ed..42bef40818bb 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -70,6 +70,8 @@ extern int leases_enable, lease_break_time;
extern int sysctl_protected_symlinks;
extern int sysctl_protected_hardlinks;
+typedef __kernel_rwf_t rwf_t;
+
struct buffer_head;
typedef int (get_block_t)(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create);
@@ -1764,9 +1766,9 @@ extern ssize_t __vfs_write(struct file *, const char __user *, size_t, loff_t *)
extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
- unsigned long, loff_t *, int);
+ unsigned long, loff_t *, rwf_t);
extern ssize_t vfs_writev(struct file *, const struct iovec __user *,
- unsigned long, loff_t *, int);
+ unsigned long, loff_t *, rwf_t);
extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
loff_t, size_t, unsigned int);
extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
@@ -2820,9 +2822,9 @@ extern ssize_t generic_file_direct_write(struct kiocb *, struct iov_iter *);
extern ssize_t generic_perform_write(struct file *, struct iov_iter *, loff_t);
ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags);
+ rwf_t flags);
ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags);
+ rwf_t flags);
/* fs/block_dev.c */
extern ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to);
@@ -3088,7 +3090,7 @@ static inline int iocb_flags(struct file *file)
return res;
}
-static inline int kiocb_set_rw_flags(struct kiocb *ki, int flags)
+static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
{
if (unlikely(flags & ~RWF_SUPPORTED))
return -EOPNOTSUPP;
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index a2d4a8ac94ca..a04adbc70ddf 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -28,6 +28,7 @@
#define __LINUX__AIO_ABI_H
#include <linux/types.h>
+#include <linux/fs.h>
#include <asm/byteorder.h>
typedef __kernel_ulong_t aio_context_t;
@@ -62,14 +63,6 @@ struct io_event {
__s64 res2; /* secondary result */
};
-#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
-#define PADDED(x,y) x, y
-#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
-#define PADDED(x,y) y, x
-#else
-#error edit for your odd byteorder.
-#endif
-
/*
* we always use a 64bit off_t when communicating
* with userland. its up to libraries to do the
@@ -79,8 +72,16 @@ struct io_event {
struct iocb {
/* these are internal to the kernel/libc. */
__u64 aio_data; /* data to be returned in event's data */
- __u32 PADDED(aio_key, aio_rw_flags);
- /* the kernel sets aio_key to the req # */
+
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
+ __u32 aio_key; /* the kernel sets aio_key to the req # */
+ __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
+#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
+ __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
+ __u32 aio_key; /* the kernel sets aio_key to the req # */
+#else
+#error edit for your odd byteorder.
+#endif
/* common fields */
__u16 aio_lio_opcode; /* see IOCB_CMD_ above */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 27d8c36c04af..c5439778a85f 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -356,13 +356,33 @@ struct fscrypt_key {
#define SYNC_FILE_RANGE_WRITE 2
#define SYNC_FILE_RANGE_WAIT_AFTER 4
-/* flags for preadv2/pwritev2: */
-#define RWF_HIPRI 0x00000001 /* high priority request, poll if possible */
-#define RWF_DSYNC 0x00000002 /* per-IO O_DSYNC */
-#define RWF_SYNC 0x00000004 /* per-IO O_SYNC */
-#define RWF_NOWAIT 0x00000008 /* per-IO, return -EAGAIN if operation would block */
-
-#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC |\
- RWF_NOWAIT)
+#define BLK_STS_OK 0
+#define BLK_STS_NOTSUPP ((__force blk_status_t)1)
+#define BLK_STS_TIMEOUT ((__force blk_status_t)2)
+#define BLK_STS_NOSPC ((__force blk_status_t)3)
+#define BLK_STS_TRANSPORT ((__force blk_status_t)4)
+#define BLK_STS_TARGET ((__force blk_status_t)5)
+#define BLK_STS_NEXUS ((__force blk_status_t)6)
+
+/*
+ * Flags for preadv2/pwritev2:
+ */
+
+typedef int __bitwise __kernel_rwf_t;
+
+/* high priority request, poll if possible */
+#define RWF_HIPRI ((__force __kernel_rwf_t)0x00000001)
+
+/* per-IO O_DSYNC */
+#define RWF_DSYNC ((__force __kernel_rwf_t)0x00000002)
+
+/* per-IO O_SYNC */
+#define RWF_SYNC ((__force __kernel_rwf_t)0x00000004)
+
+/* per-IO, return -EAGAIN if operation would block */
+#define RWF_NOWAIT ((__force __kernel_rwf_t)0x00000008)
+
+/* mask of flags supported by the kernel */
+#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT)
#endif /* _UAPI_LINUX_FS_H */
On Thu, Jul 06, 2017 at 04:48:40PM +0200, Christoph Hellwig wrote:
> Just did the whole batch (patch below), but it seems like using a
> __bitwise type in SYSCALL_DEFINE* will always give warnings like:
>
> fs/read_write.c:1095:1: warning: cast to restricted __kernel_rwf_t
>
> which I'm not sure to deal with..
#define __SC_CAST(t, a) (__force t) a
in syscalls.h
> index a2d4a8ac94ca..a04adbc70ddf 100644
> --- a/include/uapi/linux/aio_abi.h
> +++ b/include/uapi/linux/aio_abi.h
> @@ -28,6 +28,7 @@
> #define __LINUX__AIO_ABI_H
>
> #include <linux/types.h>
> +#include <linux/fs.h>
Um... Includes of non-uapi in uapi are wrong. What do you need
fs.h for, anyway? Just put the typedef into uapi/linux/types.h
and be done with that...
> +#define BLK_STS_OK 0
> +#define BLK_STS_NOTSUPP ((__force blk_status_t)1)
> +#define BLK_STS_TIMEOUT ((__force blk_status_t)2)
> +#define BLK_STS_NOSPC ((__force blk_status_t)3)
> +#define BLK_STS_TRANSPORT ((__force blk_status_t)4)
> +#define BLK_STS_TARGET ((__force blk_status_t)5)
> +#define BLK_STS_NEXUS ((__force blk_status_t)6)
WTF is that doing here? If nothing else, it's a userland namespace
pollution; typedefs with such names are OK in the kernel, but not
in something that might be included from userland. And that chunk
doesn't seem to have anything to do with the rest of the patch...
Please, move that on top of current #work.read_write - there's
a fix for buggered vfs_iter_write() in it.
On Thu, Jul 06, 2017 at 04:03:30PM +0100, Al Viro wrote:
> #define __SC_CAST(t, a) (__force t) a
>
> in syscalls.h
Hmm.
>
> > index a2d4a8ac94ca..a04adbc70ddf 100644
> > --- a/include/uapi/linux/aio_abi.h
> > +++ b/include/uapi/linux/aio_abi.h
> > @@ -28,6 +28,7 @@
> > #define __LINUX__AIO_ABI_H
> >
> > #include <linux/types.h>
> > +#include <linux/fs.h>
>
> Um... Includes of non-uapi in uapi are wrong. What do you need
> fs.h for, anyway? Just put the typedef into uapi/linux/types.h
> and be done with that...
We automatically get the non-uapi one for userspace. And we do in
fact need to write it that way as it will not show up as uapi/ in
userspace.
But yes, the type could be taken to types.h
>
> > +#define BLK_STS_OK 0
> > +#define BLK_STS_NOTSUPP ((__force blk_status_t)1)
> > +#define BLK_STS_TIMEOUT ((__force blk_status_t)2)
> > +#define BLK_STS_NOSPC ((__force blk_status_t)3)
> > +#define BLK_STS_TRANSPORT ((__force blk_status_t)4)
> > +#define BLK_STS_TARGET ((__force blk_status_t)5)
> > +#define BLK_STS_NEXUS ((__force blk_status_t)6)
>
> WTF is that doing here? If nothing else, it's a userland namespace
> pollution; typedefs with such names are OK in the kernel, but not
> is something that might be included from userland. And that chunk
> doesn't seem to have anything to do with the rest of the patch...
It doesn't; that was the copy and paste of the __bitwise boilerplate
I started with..
On Thu, Jul 06, 2017 at 04:03:30PM +0100, Al Viro wrote:
> On Thu, Jul 06, 2017 at 04:48:40PM +0200, Christoph Hellwig wrote:
>
> > Just did the whole batch (patch below), but it seems like using a
> > __bitwise type in SYSCALL_DEFINE* will always give warnings like:
> >
> > fs/read_write.c:1095:1: warning: cast to restricted __kernel_rwf_t
> >
> > which I'm not sure to deal with..
>
> #define __SC_CAST(t, a) (__force t) a
doesn't seem to make a difference..
On Thu, Jul 06, 2017 at 05:10:33PM +0200, Christoph Hellwig wrote:
> On Thu, Jul 06, 2017 at 04:03:30PM +0100, Al Viro wrote:
> > On Thu, Jul 06, 2017 at 04:48:40PM +0200, Christoph Hellwig wrote:
> >
> > > Just did the whole batch (patch below), but it seems like using a
> > > __bitwise type in SYSCALL_DEFINE* will always give warnings like:
> > >
> > > fs/read_write.c:1095:1: warning: cast to restricted __kernel_rwf_t
> > >
> > > which I'm not sure to deal with..
> >
> > #define __SC_CAST(t, a) (__force t) a
>
> doesn't seem to make a difference..
Works here...
; cat a.c
#include <linux/syscalls.h>
#undef __SC_CAST
#define __SC_CAST(t, a) (__force t)a
typedef int __bitwise __foo_t;
static __foo_t is_OK;
static int will_warn;
SYSCALL_DEFINE1(foo, __foo_t, arg)
{
is_OK = arg;
will_warn = arg;
return 0;
}
; make a.o C=2 CHECK=~/local/sparse/sparse
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
CHECK arch/x86/purgatory/purgatory.c
CHECK arch/x86/purgatory/sha256.c
CHECK arch/x86/purgatory/string.c
arch/x86/purgatory/../boot/string.c:165:6: warning: symbol 'strchr' was not declared. Should it be static?
CHK include/generated/bounds.h
CHK include/generated/timeconst.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
DESCEND objtool
CHECK scripts/mod/empty.c
CHK scripts/mod/devicetable-offsets.h
CHK include/generated/timeconst.h
CHK include/generated/bounds.h
CHECK a.c
a.c:10:19: warning: incorrect type in assignment (different base types)
a.c:10:19: expected int static [signed] [toplevel] will_warn
a.c:10:19: got restricted __foo_t [usertype] arg
CC a.o
;
That - on #work.read_write, as in vfs.git at the moment...
On Thu, Jul 06, 2017 at 04:46:02PM +0100, Al Viro wrote:
> That - on #work.read_write, as in vfs.git at the moment...
... and for COMPAT_SYSCALL you need
#define __SC_DELOUSE(t,v) ((__force t)(unsigned long)(v))
in linux/compat.h
On Thu, Jul 06, 2017 at 04:51:13PM +0100, Al Viro wrote:
> On Thu, Jul 06, 2017 at 04:46:02PM +0100, Al Viro wrote:
>
> > That - on #work.read_write, as in vfs.git at the moment...
>
> ... and for COMPAT_SYSCALL you need
> #define __SC_DELOUSE(t,v) ((__force t)(unsigned long)(v))
> in linux/compat.h
I'm still getting warnings with both these force casts. This is
the current stack:
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index 0ddd37e6c29d..019f0de892c3 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -12,7 +12,7 @@
#define __SC_DELOUSE(t,v) ({ \
BUILD_BUG_ON(sizeof(t) > 4 && !__TYPE_IS_PTR(t)); \
- (t)(__TYPE_IS_PTR(t) ? ((v) & 0x7fffffff) : (v)); \
+ (__force t)(__TYPE_IS_PTR(t) ? ((v) & 0x7fffffff) : (v)); \
})
#define PSW32_MASK_PER 0x40000000UL
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 38d0383dc7f9..bc69d40c4e8b 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -969,7 +969,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
int use_wgather;
loff_t pos = offset;
unsigned int pflags = current->flags;
- int flags = 0;
+ rwf_t flags = 0;
if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
/*
diff --git a/fs/read_write.c b/fs/read_write.c
index a2cbc8303dae..bc2db5e5cd19 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -633,7 +633,7 @@ unsigned long iov_shorten(struct iovec *iov, unsigned long nr_segs, size_t to)
EXPORT_SYMBOL(iov_shorten);
static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
- loff_t *ppos, int type, int flags)
+ loff_t *ppos, int type, rwf_t flags)
{
struct kiocb kiocb;
ssize_t ret;
@@ -655,7 +655,7 @@ static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
/* Do it by hand, with file-ops */
static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
- loff_t *ppos, int type, int flags)
+ loff_t *ppos, int type, rwf_t flags)
{
ssize_t ret = 0;
@@ -871,7 +871,7 @@ ssize_t compat_rw_copy_check_uvector(int type,
#endif
static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
- loff_t *pos, int flags)
+ loff_t *pos, rwf_t flags)
{
size_t tot_len;
ssize_t ret = 0;
@@ -899,7 +899,7 @@ static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
}
ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags)
+ rwf_t flags)
{
if (!file->f_op->read_iter)
return -EINVAL;
@@ -908,7 +908,7 @@ ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
EXPORT_SYMBOL(vfs_iter_read);
static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
- loff_t *pos, int flags)
+ loff_t *pos, rwf_t flags)
{
size_t tot_len;
ssize_t ret = 0;
@@ -937,7 +937,7 @@ static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
}
ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags)
+ rwf_t flags)
{
if (!file->f_op->write_iter)
return -EINVAL;
@@ -946,7 +946,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
EXPORT_SYMBOL(vfs_iter_write);
ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -964,7 +964,7 @@ ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
EXPORT_SYMBOL(vfs_readv);
ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -981,7 +981,7 @@ ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
EXPORT_SYMBOL(vfs_writev);
static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, int flags)
+ unsigned long vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
@@ -1001,7 +1001,7 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
}
static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, int flags)
+ unsigned long vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
@@ -1027,7 +1027,7 @@ static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
}
static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret = -EBADF;
@@ -1050,7 +1050,7 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
}
static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret = -EBADF;
@@ -1094,7 +1094,7 @@ SYSCALL_DEFINE5(preadv, unsigned long, fd, const struct iovec __user *, vec,
SYSCALL_DEFINE6(preadv2, unsigned long, fd, const struct iovec __user *, vec,
unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = pos_from_hilo(pos_h, pos_l);
@@ -1114,7 +1114,7 @@ SYSCALL_DEFINE5(pwritev, unsigned long, fd, const struct iovec __user *, vec,
SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = pos_from_hilo(pos_h, pos_l);
@@ -1127,7 +1127,7 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
#ifdef CONFIG_COMPAT
static size_t compat_readv(struct file *file,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -1147,7 +1147,7 @@ static size_t compat_readv(struct file *file,
static size_t do_compat_readv(compat_ulong_t fd,
const struct compat_iovec __user *vec,
- compat_ulong_t vlen, int flags)
+ compat_ulong_t vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret;
@@ -1173,7 +1173,7 @@ COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
static long do_compat_preadv64(unsigned long fd,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret;
@@ -1211,7 +1211,7 @@ COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd,
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
- unsigned long, vlen, loff_t, pos, int, flags)
+ unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
return do_compat_preadv64(fd, vec, vlen, pos, flags);
}
@@ -1220,7 +1220,7 @@ COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
const struct compat_iovec __user *,vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
@@ -1232,7 +1232,7 @@ COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
static size_t compat_writev(struct file *file,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -1252,7 +1252,7 @@ static size_t compat_writev(struct file *file,
static size_t do_compat_writev(compat_ulong_t fd,
const struct compat_iovec __user* vec,
- compat_ulong_t vlen, int flags)
+ compat_ulong_t vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret;
@@ -1277,7 +1277,7 @@ COMPAT_SYSCALL_DEFINE3(writev, compat_ulong_t, fd,
static long do_compat_pwritev64(unsigned long fd,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret;
@@ -1315,7 +1315,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
- unsigned long, vlen, loff_t, pos, int, flags)
+ unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
return do_compat_pwritev64(fd, vec, vlen, pos, flags);
}
@@ -1323,7 +1323,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd,
const struct compat_iovec __user *,vec,
- compat_ulong_t, vlen, u32, pos_low, u32, pos_high, int, flags)
+ compat_ulong_t, vlen, u32, pos_low, u32, pos_high, rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 2ed54020ace0..09a975f4102f 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -27,7 +27,7 @@
#endif
#ifndef __SC_DELOUSE
-#define __SC_DELOUSE(t,v) ((t)(unsigned long)(v))
+#define __SC_DELOUSE(t,v) ((__force t)(unsigned long)(v))
#endif
#define COMPAT_SYSCALL_DEFINE0(name) \
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9380c40b498b..1f217331dae6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -70,6 +70,8 @@ extern int leases_enable, lease_break_time;
extern int sysctl_protected_symlinks;
extern int sysctl_protected_hardlinks;
+typedef __kernel_rwf_t rwf_t;
+
struct buffer_head;
typedef int (get_block_t)(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create);
@@ -1764,9 +1766,9 @@ extern ssize_t __vfs_write(struct file *, const char __user *, size_t, loff_t *)
extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
- unsigned long, loff_t *, int);
+ unsigned long, loff_t *, rwf_t);
extern ssize_t vfs_writev(struct file *, const struct iovec __user *,
- unsigned long, loff_t *, int);
+ unsigned long, loff_t *, rwf_t);
extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
loff_t, size_t, unsigned int);
extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
@@ -2820,9 +2822,9 @@ extern ssize_t generic_file_direct_write(struct kiocb *, struct iov_iter *);
extern ssize_t generic_perform_write(struct file *, struct iov_iter *, loff_t);
ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags);
+ rwf_t flags);
ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags);
+ rwf_t flags);
/* fs/block_dev.c */
extern ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to);
@@ -3088,7 +3090,7 @@ static inline int iocb_flags(struct file *file)
return res;
}
-static inline int kiocb_set_rw_flags(struct kiocb *ki, int flags)
+static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
{
if (unlikely(flags & ~RWF_SUPPORTED))
return -EOPNOTSUPP;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 980c3c9b06f8..1a78dbc6c900 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -104,7 +104,7 @@ union bpf_attr;
#define __TYPE_IS_UL(t) (__same_type((t)0, 0UL))
#define __TYPE_IS_LL(t) (__same_type((t)0, 0LL) || __same_type((t)0, 0ULL))
#define __SC_LONG(t, a) __typeof(__builtin_choose_expr(__TYPE_IS_LL(t), 0LL, 0L)) a
-#define __SC_CAST(t, a) (t) a
+#define __SC_CAST(t, a) (__force t) a
#define __SC_ARGS(t, a) a
#define __SC_TEST(t, a) (void)BUILD_BUG_ON_ZERO(!__TYPE_IS_LL(t) && sizeof(t) > sizeof(long))
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index a2d4a8ac94ca..a04adbc70ddf 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -28,6 +28,7 @@
#define __LINUX__AIO_ABI_H
#include <linux/types.h>
+#include <linux/fs.h>
#include <asm/byteorder.h>
typedef __kernel_ulong_t aio_context_t;
@@ -62,14 +63,6 @@ struct io_event {
__s64 res2; /* secondary result */
};
-#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
-#define PADDED(x,y) x, y
-#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
-#define PADDED(x,y) y, x
-#else
-#error edit for your odd byteorder.
-#endif
-
/*
* we always use a 64bit off_t when communicating
* with userland. its up to libraries to do the
@@ -79,8 +72,16 @@ struct io_event {
struct iocb {
/* these are internal to the kernel/libc. */
__u64 aio_data; /* data to be returned in event's data */
- __u32 PADDED(aio_key, aio_rw_flags);
- /* the kernel sets aio_key to the req # */
+
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
+ __u32 aio_key; /* the kernel sets aio_key to the req # */
+ __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
+#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
+ __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
+ __u32 aio_key; /* the kernel sets aio_key to the req # */
+#else
+#error edit for your odd byteorder.
+#endif
/* common fields */
__u16 aio_lio_opcode; /* see IOCB_CMD_ above */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 27d8c36c04af..e8ebc18aa9c9 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -356,13 +356,25 @@ struct fscrypt_key {
#define SYNC_FILE_RANGE_WRITE 2
#define SYNC_FILE_RANGE_WAIT_AFTER 4
-/* flags for preadv2/pwritev2: */
-#define RWF_HIPRI 0x00000001 /* high priority request, poll if possible */
-#define RWF_DSYNC 0x00000002 /* per-IO O_DSYNC */
-#define RWF_SYNC 0x00000004 /* per-IO O_SYNC */
-#define RWF_NOWAIT 0x00000008 /* per-IO, return -EAGAIN if operation would block */
-
-#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC |\
- RWF_NOWAIT)
+/*
+ * Flags for preadv2/pwritev2:
+ */
+
+typedef int __bitwise __kernel_rwf_t;
+
+/* high priority request, poll if possible */
+#define RWF_HIPRI ((__force __kernel_rwf_t)0x00000001)
+
+/* per-IO O_DSYNC */
+#define RWF_DSYNC ((__force __kernel_rwf_t)0x00000002)
+
+/* per-IO O_SYNC */
+#define RWF_SYNC ((__force __kernel_rwf_t)0x00000004)
+
+/* per-IO, return -EAGAIN if operation would block */
+#define RWF_NOWAIT ((__force __kernel_rwf_t)0x00000008)
+
+/* mask of flags supported by the kernel */
+#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT)
#endif /* _UAPI_LINUX_FS_H */
On Thu, Jul 06, 2017 at 06:58:37PM +0200, Christoph Hellwig wrote:
> On Thu, Jul 06, 2017 at 04:51:13PM +0100, Al Viro wrote:
> > On Thu, Jul 06, 2017 at 04:46:02PM +0100, Al Viro wrote:
> >
> > > That - on #work.read_write, as in vfs.git at the moment...
> >
> > ... and for COMPAT_SYSCALL you need
> > #define __SC_DELOUSE(t,v) ((__force t)(unsigned long)(v))
> > in linux/compat.h
>
> I'm still getting warnings with both these force casts. This is
> the current stack:
That + Linus' tree as of the end of yesterday =>
CHECK fs/read_write.c
fs/read_write.c:38:29: warning: incorrect type in return expression (different base types)
fs/read_write.c:38:29: expected int
fs/read_write.c:38:29: got restricted fmode_t
fs/read_write.c:38:29: warning: incorrect type in return expression (different base types)
fs/read_write.c:38:29: expected int
fs/read_write.c:38:29: got restricted fmode_t
fs/read_write.c:38:29: warning: incorrect type in return expression (different base types)
fs/read_write.c:38:29: expected int
fs/read_write.c:38:29: got restricted fmode_t
fs/read_write.c:38:29: warning: incorrect type in return expression (different base types)
fs/read_write.c:38:29: expected int
fs/read_write.c:38:29: got restricted fmode_t
All of which are from unsigned_offsets() and that's one case where bool would be better than
int. Switching the return type to bool yields
CHECK fs/read_write.c
CC fs/read_write.o
- no warnings at all.
Which sparse version are you using and what's your .config?
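For readers following along, the int-versus-bool point about unsigned_offsets() can be shown in plain userspace C. This is only a sketch: DEMO_FMODE_UNSIGNED_OFFSET and both helper names are made up for the example, and the flag value is illustrative. It does not run sparse; it only shows that the bool variant reports a truth value while the int variant hands the raw masked bit back to the caller.

```c
#include <stdbool.h>

/* Hypothetical stand-in for FMODE_UNSIGNED_OFFSET; the value is illustrative. */
#define DEMO_FMODE_UNSIGNED_OFFSET 0x2000u

/* int flavour: the caller receives the raw masked bit (0x2000 or 0). */
static int demo_unsigned_offsets_int(unsigned int f_mode)
{
	return f_mode & DEMO_FMODE_UNSIGNED_OFFSET;
}

/* bool flavour: conversion to bool collapses any non-zero result to true. */
static bool demo_unsigned_offsets_bool(unsigned int f_mode)
{
	return f_mode & DEMO_FMODE_UNSIGNED_OFFSET;
}
```

With a bitwise-annotated mode type, the int flavour is the one that would have to convert a restricted value to a plain integer, which matches the warnings quoted above.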
> Which sparse version are you using and what's your .config?
sparse is v0.5.0-62-gce18a90, .config is attached.
On Thu, Jul 06, 2017 at 11:44:49PM +0200, Christoph Hellwig wrote:
> > Which sparse version are you using and what's your .config?
>
> sparse is v0.5.0-62-gce18a90, .config is attached.
Arrgh... OK, I see what's going on. sparse commit affecting that
is "Allow casting to a restricted type if !restricted_value"; it
allows the things like (__le32)0. It's present in sparse.git,
but not in chrisl/sparse.git, which is what you are using.
Anyway, the thing I'd missed kernel-side is this:
#define __TYPE_IS_L(t) (__same_type((t)0, 0L))
#define __TYPE_IS_UL(t) (__same_type((t)0, 0UL))
#define __TYPE_IS_LL(t) (__same_type((t)0, 0LL) || __same_type((t)0, 0ULL))
Let's turn them into
#define __TYPE_AS(t, v) __same_type((__force t)0, v)
#define __TYPE_IS_L(t) (__TYPE_AS(t, 0L))
#define __TYPE_IS_UL(t) (__TYPE_AS(t, 0UL))
#define __TYPE_IS_LL(t) (__TYPE_AS(t, 0LL) || __TYPE_AS(t, 0ULL))
That should do it both for old and for new versions of sparse.
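A userspace sketch of how the reworked macros behave, assuming GCC or clang. The DEMO_ names are made up for the example, demo_same_type() is a stand-in for the kernel's __same_type(), and __force is defined away here, as it is in any build that is not a sparse run.

```c
/* Outside sparse the annotation is a no-op. */
#ifndef __force
#define __force
#endif

/* Stand-in for __same_type(): compare the types of two expressions. */
#define demo_same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* The forced cast lets a bitwise type t be turned into a typed zero. */
#define DEMO_TYPE_AS(t, v)	demo_same_type((__force t)0, v)
#define DEMO_TYPE_IS_L(t)	(DEMO_TYPE_AS(t, 0L))
#define DEMO_TYPE_IS_UL(t)	(DEMO_TYPE_AS(t, 0UL))
#define DEMO_TYPE_IS_LL(t)	(DEMO_TYPE_AS(t, 0LL) || DEMO_TYPE_AS(t, 0ULL))

/* Compile-time checks: the macros are integer constant expressions. */
_Static_assert(DEMO_TYPE_IS_L(long), "long matches 0L");
_Static_assert(!DEMO_TYPE_IS_LL(int), "int is not a 64-bit long long type");
```

The only change compared with the original three macros is the __force inside the cast, so a normal compile should see exactly the same types as before.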
On Fri, Jul 07, 2017 at 12:27:49AM +0100, Al Viro wrote:
> On Thu, Jul 06, 2017 at 11:44:49PM +0200, Christoph Hellwig wrote:
> > > Which sparse version are you using and what's your .config?
> >
> > sparse is v0.5.0-62-gce18a90, .config is attached.
>
> Arrgh... OK, I see what's going on. sparse commit affecting that
> is "Allow casting to a restricted type if !restricted_value"; it
> allows the things like (__le32)0. It's present in sparse.git,
> but not in chrisl/sparse.git, which is what you are using.
So what's the current story on sparse versions to use and releases?
At some point it seemed like upstream sparse was sort of dead
and the chrisl repo was the one to use. It seems the main sparse
repo on git.kernel.org is now identical to the chrisl one, and he
is doing the releases. So maybe I'll just need to upgrade..
On Fri, Jul 7, 2017 at 7:00 AM, Christoph Hellwig <[email protected]> wrote:
>
> So what's the current story on sparse versions to use and releases?
The releases are done way too seldom to be useful, but that may be
improving. There is one fairly imminent, and it's probably a good idea
to just test the current git tree.
Linus
On Fri, Jul 7, 2017 at 8:48 AM, Linus Torvalds
<[email protected]> wrote:
> The releases are done way too seldom to be useful, but that may be
> improving. There is one fairly imminent, and it's probably a good idea
> to just test the current git tree.
Yes, guilty of too few releases. We are cutting one release pretty soon.
The current master branch of the sparse git repository has gone to RC4
of release v0.5.1.
BTW, the current sparse development has moved back to the official sparse
repository. The sparse-next branch is the one to test the bleeding edge bits.
In theory sparse-next can roll back and rewrite history, so there is no
guarantee it will be a clean pull. The master branch will never rebase, so it
will provide a clean pull.
Chris
On Fri, Jul 07, 2017 at 04:00:02PM +0200, Christoph Hellwig wrote:
> On Fri, Jul 07, 2017 at 12:27:49AM +0100, Al Viro wrote:
> > On Thu, Jul 06, 2017 at 11:44:49PM +0200, Christoph Hellwig wrote:
> > > > Which sparse version are you using and what's your .config?
> > >
> > > sparse is v0.5.0-62-gce18a90, .config is attached.
> >
> > Arrgh... OK, I see what's going on. sparse commit affecting that
> > is "Allow casting to a restricted type if !restricted_value"; it
> > allows the things like (__le32)0. It's present in sparse.git,
> > but not in chrisl/sparse.git, which is what you are using.
>
> So what's the current story on sparse versions to use and releases?
>
> At some point it seemed like upstream sparse was sort of dead
> and the chrisl repo was the one to use. It seems the main sparse
> repo on git.kernel.org is now identical to the chrisl one, and he
> is doing the releases. So maybe I'll just need to upgrade..
OK, here's what I have on top of #for-linus. Handling of __bitwise in
{COMPAT_,}SYSCALL_DEFINE went into the tip of vfs.git#for-linus, works
both for old and for new sparse versions. I'd added annotations of
sys_pwritev2(), etc., in syscalls.h and compat.h, other than that it's
what you'd posted... Care to put your S-o-b on that?
commit 1c0ecce04fc5747dbd0578360895fe910d3124db
Author: Christoph Hellwig <[email protected]>
Date: Thu Jul 6 18:58:37 2017 +0200
annotate RWF_... flags
[AV: added missing annotations in syscalls.h/compat.h]
Signed-off-by: Al Viro <[email protected]>
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 38d0383dc7f9..bc69d40c4e8b 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -969,7 +969,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
int use_wgather;
loff_t pos = offset;
unsigned int pflags = current->flags;
- int flags = 0;
+ rwf_t flags = 0;
if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
/*
diff --git a/fs/read_write.c b/fs/read_write.c
index a2cbc8303dae..327d4aeefca0 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -33,7 +33,7 @@ const struct file_operations generic_ro_fops = {
EXPORT_SYMBOL(generic_ro_fops);
-static inline int unsigned_offsets(struct file *file)
+static inline bool unsigned_offsets(struct file *file)
{
return file->f_mode & FMODE_UNSIGNED_OFFSET;
}
@@ -633,7 +633,7 @@ unsigned long iov_shorten(struct iovec *iov, unsigned long nr_segs, size_t to)
EXPORT_SYMBOL(iov_shorten);
static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
- loff_t *ppos, int type, int flags)
+ loff_t *ppos, int type, rwf_t flags)
{
struct kiocb kiocb;
ssize_t ret;
@@ -655,7 +655,7 @@ static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
/* Do it by hand, with file-ops */
static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
- loff_t *ppos, int type, int flags)
+ loff_t *ppos, int type, rwf_t flags)
{
ssize_t ret = 0;
@@ -871,7 +871,7 @@ ssize_t compat_rw_copy_check_uvector(int type,
#endif
static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
- loff_t *pos, int flags)
+ loff_t *pos, rwf_t flags)
{
size_t tot_len;
ssize_t ret = 0;
@@ -899,7 +899,7 @@ static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
}
ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags)
+ rwf_t flags)
{
if (!file->f_op->read_iter)
return -EINVAL;
@@ -908,7 +908,7 @@ ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
EXPORT_SYMBOL(vfs_iter_read);
static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
- loff_t *pos, int flags)
+ loff_t *pos, rwf_t flags)
{
size_t tot_len;
ssize_t ret = 0;
@@ -937,7 +937,7 @@ static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
}
ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags)
+ rwf_t flags)
{
if (!file->f_op->write_iter)
return -EINVAL;
@@ -946,7 +946,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
EXPORT_SYMBOL(vfs_iter_write);
ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -964,7 +964,7 @@ ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
EXPORT_SYMBOL(vfs_readv);
ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -981,7 +981,7 @@ ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
EXPORT_SYMBOL(vfs_writev);
static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, int flags)
+ unsigned long vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
@@ -1001,7 +1001,7 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
}
static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, int flags)
+ unsigned long vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret = -EBADF;
@@ -1027,7 +1027,7 @@ static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
}
static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret = -EBADF;
@@ -1050,7 +1050,7 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
}
static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret = -EBADF;
@@ -1094,7 +1094,7 @@ SYSCALL_DEFINE5(preadv, unsigned long, fd, const struct iovec __user *, vec,
SYSCALL_DEFINE6(preadv2, unsigned long, fd, const struct iovec __user *, vec,
unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = pos_from_hilo(pos_h, pos_l);
@@ -1114,7 +1114,7 @@ SYSCALL_DEFINE5(pwritev, unsigned long, fd, const struct iovec __user *, vec,
SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = pos_from_hilo(pos_h, pos_l);
@@ -1127,7 +1127,7 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
#ifdef CONFIG_COMPAT
static size_t compat_readv(struct file *file,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -1147,7 +1147,7 @@ static size_t compat_readv(struct file *file,
static size_t do_compat_readv(compat_ulong_t fd,
const struct compat_iovec __user *vec,
- compat_ulong_t vlen, int flags)
+ compat_ulong_t vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret;
@@ -1173,7 +1173,7 @@ COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
static long do_compat_preadv64(unsigned long fd,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret;
@@ -1211,7 +1211,7 @@ COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd,
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
- unsigned long, vlen, loff_t, pos, int, flags)
+ unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
return do_compat_preadv64(fd, vec, vlen, pos, flags);
}
@@ -1220,7 +1220,7 @@ COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
const struct compat_iovec __user *,vec,
compat_ulong_t, vlen, u32, pos_low, u32, pos_high,
- int, flags)
+ rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
@@ -1232,7 +1232,7 @@ COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
static size_t compat_writev(struct file *file,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t *pos, int flags)
+ unsigned long vlen, loff_t *pos, rwf_t flags)
{
struct iovec iovstack[UIO_FASTIOV];
struct iovec *iov = iovstack;
@@ -1252,7 +1252,7 @@ static size_t compat_writev(struct file *file,
static size_t do_compat_writev(compat_ulong_t fd,
const struct compat_iovec __user* vec,
- compat_ulong_t vlen, int flags)
+ compat_ulong_t vlen, rwf_t flags)
{
struct fd f = fdget_pos(fd);
ssize_t ret;
@@ -1277,7 +1277,7 @@ COMPAT_SYSCALL_DEFINE3(writev, compat_ulong_t, fd,
static long do_compat_pwritev64(unsigned long fd,
const struct compat_iovec __user *vec,
- unsigned long vlen, loff_t pos, int flags)
+ unsigned long vlen, loff_t pos, rwf_t flags)
{
struct fd f;
ssize_t ret;
@@ -1315,7 +1315,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
const struct compat_iovec __user *,vec,
- unsigned long, vlen, loff_t, pos, int, flags)
+ unsigned long, vlen, loff_t, pos, rwf_t, flags)
{
return do_compat_pwritev64(fd, vec, vlen, pos, flags);
}
@@ -1323,7 +1323,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd,
const struct compat_iovec __user *,vec,
- compat_ulong_t, vlen, u32, pos_low, u32, pos_high, int, flags)
+ compat_ulong_t, vlen, u32, pos_low, u32, pos_high, rwf_t, flags)
{
loff_t pos = ((loff_t)pos_high << 32) | pos_low;
diff --git a/include/linux/compat.h b/include/linux/compat.h
index e5d3fbe24f7d..3fc433303d7a 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -365,10 +365,10 @@ asmlinkage ssize_t compat_sys_pwritev(compat_ulong_t fd,
compat_ulong_t vlen, u32 pos_low, u32 pos_high);
asmlinkage ssize_t compat_sys_preadv2(compat_ulong_t fd,
const struct compat_iovec __user *vec,
- compat_ulong_t vlen, u32 pos_low, u32 pos_high, int flags);
+ compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
const struct compat_iovec __user *vec,
- compat_ulong_t vlen, u32 pos_low, u32 pos_high, int flags);
+ compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
asmlinkage long compat_sys_preadv64(unsigned long fd,
@@ -382,6 +382,18 @@ asmlinkage long compat_sys_pwritev64(unsigned long fd,
unsigned long vlen, loff_t pos);
#endif
+#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
+asmlinkage long compat_sys_preadv64v2(unsigned long fd,
+ const struct compat_iovec __user *vec,
+ unsigned long vlen, loff_t pos, rwf_t flags);
+#endif
+
+#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
+asmlinkage long compat_sys_pwritev64v2(unsigned long fd,
+ const struct compat_iovec __user *vec,
+ unsigned long vlen, loff_t pos, rwf_t flags);
+#endif
+
asmlinkage long compat_sys_lseek(unsigned int, compat_off_t, unsigned int);
asmlinkage long compat_sys_execve(const char __user *filename, const compat_uptr_t __user *argv,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0cfa47125d52..cddebe2a93e3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -71,6 +71,8 @@ extern int leases_enable, lease_break_time;
extern int sysctl_protected_symlinks;
extern int sysctl_protected_hardlinks;
+typedef __kernel_rwf_t rwf_t;
+
struct buffer_head;
typedef int (get_block_t)(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create);
@@ -1761,9 +1763,9 @@ extern ssize_t __vfs_write(struct file *, const char __user *, size_t, loff_t *)
extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
- unsigned long, loff_t *, int);
+ unsigned long, loff_t *, rwf_t);
extern ssize_t vfs_writev(struct file *, const struct iovec __user *,
- unsigned long, loff_t *, int);
+ unsigned long, loff_t *, rwf_t);
extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
loff_t, size_t, unsigned int);
extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
@@ -2873,9 +2875,9 @@ extern ssize_t generic_file_direct_write(struct kiocb *, struct iov_iter *);
extern ssize_t generic_perform_write(struct file *, struct iov_iter *, loff_t);
ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags);
+ rwf_t flags);
ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
- int flags);
+ rwf_t flags);
/* fs/block_dev.c */
extern ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to);
@@ -3141,7 +3143,7 @@ static inline int iocb_flags(struct file *file)
return res;
}
-static inline int kiocb_set_rw_flags(struct kiocb *ki, int flags)
+static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
{
if (unlikely(flags & ~RWF_SUPPORTED))
return -EOPNOTSUPP;
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 0bc1d2e8cc17..138c94535864 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -579,12 +579,12 @@ asmlinkage long sys_preadv(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, unsigned long pos_l, unsigned long pos_h);
asmlinkage long sys_preadv2(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, unsigned long pos_l, unsigned long pos_h,
- int flags);
+ rwf_t flags);
asmlinkage long sys_pwritev(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, unsigned long pos_l, unsigned long pos_h);
asmlinkage long sys_pwritev2(unsigned long fd, const struct iovec __user *vec,
unsigned long vlen, unsigned long pos_l, unsigned long pos_h,
- int flags);
+ rwf_t flags);
asmlinkage long sys_getcwd(char __user *buf, unsigned long size);
asmlinkage long sys_mkdir(const char __user *pathname, umode_t mode);
asmlinkage long sys_chdir(const char __user *filename);
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index a2d4a8ac94ca..a04adbc70ddf 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -28,6 +28,7 @@
#define __LINUX__AIO_ABI_H
#include <linux/types.h>
+#include <linux/fs.h>
#include <asm/byteorder.h>
typedef __kernel_ulong_t aio_context_t;
@@ -62,14 +63,6 @@ struct io_event {
__s64 res2; /* secondary result */
};
-#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
-#define PADDED(x,y) x, y
-#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
-#define PADDED(x,y) y, x
-#else
-#error edit for your odd byteorder.
-#endif
-
/*
* we always use a 64bit off_t when communicating
* with userland. its up to libraries to do the
@@ -79,8 +72,16 @@ struct io_event {
struct iocb {
/* these are internal to the kernel/libc. */
__u64 aio_data; /* data to be returned in event's data */
- __u32 PADDED(aio_key, aio_rw_flags);
- /* the kernel sets aio_key to the req # */
+
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
+ __u32 aio_key; /* the kernel sets aio_key to the req # */
+ __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
+#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
+ __kernel_rwf_t aio_rw_flags; /* RWF_* flags */
+ __u32 aio_key; /* the kernel sets aio_key to the req # */
+#else
+#error edit for your odd byteorder.
+#endif
/* common fields */
__u16 aio_lio_opcode; /* see IOCB_CMD_ above */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 27d8c36c04af..e8ebc18aa9c9 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -356,13 +356,25 @@ struct fscrypt_key {
#define SYNC_FILE_RANGE_WRITE 2
#define SYNC_FILE_RANGE_WAIT_AFTER 4
-/* flags for preadv2/pwritev2: */
-#define RWF_HIPRI 0x00000001 /* high priority request, poll if possible */
-#define RWF_DSYNC 0x00000002 /* per-IO O_DSYNC */
-#define RWF_SYNC 0x00000004 /* per-IO O_SYNC */
-#define RWF_NOWAIT 0x00000008 /* per-IO, return -EAGAIN if operation would block */
-
-#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC |\
- RWF_NOWAIT)
+/*
+ * Flags for preadv2/pwritev2:
+ */
+
+typedef int __bitwise __kernel_rwf_t;
+
+/* high priority request, poll if possible */
+#define RWF_HIPRI ((__force __kernel_rwf_t)0x00000001)
+
+/* per-IO O_DSYNC */
+#define RWF_DSYNC ((__force __kernel_rwf_t)0x00000002)
+
+/* per-IO O_SYNC */
+#define RWF_SYNC ((__force __kernel_rwf_t)0x00000004)
+
+/* per-IO, return -EAGAIN if operation would block */
+#define RWF_NOWAIT ((__force __kernel_rwf_t)0x00000008)
+
+/* mask of flags supported by the kernel */
+#define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT)
#endif /* _UAPI_LINUX_FS_H */
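As a rough illustration of what the annotation above buys, here is a self-contained sketch of the same pattern. The demo_ and DEMO_ names are hypothetical stand-ins for __kernel_rwf_t and the RWF_* constants, and the __bitwise/__force definitions are an approximation of the kernel's: they expand to sparse attributes only when __CHECKER__ is defined and to nothing in a normal compile.

```c
#include <errno.h>

/* Sparse defines __CHECKER__; a normal compiler sees plain integers. */
#ifdef __CHECKER__
#define __bitwise	__attribute__((bitwise))
#define __force		__attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* Hypothetical stand-in for __kernel_rwf_t. */
typedef int __bitwise demo_rwf_t;

/* Constants need a forced cast, since a plain integer is not a demo_rwf_t. */
#define DEMO_RWF_HIPRI	((__force demo_rwf_t)0x00000001)
#define DEMO_RWF_DSYNC	((__force demo_rwf_t)0x00000002)
#define DEMO_RWF_SYNC	((__force demo_rwf_t)0x00000004)
#define DEMO_RWF_NOWAIT	((__force demo_rwf_t)0x00000008)

#define DEMO_RWF_SUPPORTED \
	(DEMO_RWF_HIPRI | DEMO_RWF_DSYNC | DEMO_RWF_SYNC | DEMO_RWF_NOWAIT)

/* Same shape as the check in kiocb_set_rw_flags(): reject unknown bits. */
static int demo_check_rw_flags(demo_rwf_t flags)
{
	if (flags & ~DEMO_RWF_SUPPORTED)
		return -EOPNOTSUPP;
	return 0;
}
```

The prototype now says which kind of flags it takes, and under sparse a caller passing some other flag word (open flags, say) to demo_check_rw_flags() should get a warning instead of compiling silently.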
[davem Cc'd, due to sparc/sockets problem involved]
On Wed, Jul 05, 2017 at 11:38:21PM +0100, Al Viro wrote:
> A side note right back at you - POLL... stuff. I'd redone the old
> "hunt the buggy ->poll() instances down" series (took about 12 hours
> total), got it to the point where all remaining sparse warnings about
> that type are for genuine bugs. It goes like that:
>
> define __poll_t, annotate constants
> Type is controlled by ifdef - it's unsigned int unless CHECK_POLL is
> defined, in which case it's a bitwise type.
> ->poll() methods should return __poll_t
> annotate the places where ->poll() return values go
> annotate poll-related wait keys
> annotate poll_table_struct ->_key
> That ends all infrastructure work. Method declarations are annotated,
> instances are *not*. Due to that ifdef CHECK_POLL, normal builds, including
> normal sparse builds, are unaffected; with CF=-DCHECK_POLL you get __poll_t
> warnings.
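A sketch of the ifdef arrangement described in the quoted text, for illustration only. demo_poll_t, the DEMO_POLL* constants and demo_poll() are made-up names and values; the real series uses __poll_t and the POLL... constants. The type is a plain unsigned int unless CHECK_POLL is defined, so ordinary builds and ordinary sparse runs see no difference.

```c
/* Annotations expand to sparse attributes only under the checker. */
#ifdef __CHECKER__
#define __bitwise	__attribute__((bitwise))
#define __force		__attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* Opt-in strictness: pass -DCHECK_POLL to get the bitwise flavour. */
#ifdef CHECK_POLL
typedef unsigned int __bitwise demo_poll_t;
#else
typedef unsigned int demo_poll_t;
#endif

/* Illustrative constants, annotated so they carry the bitwise type. */
#define DEMO_POLLIN	((__force demo_poll_t)0x0001)
#define DEMO_POLLOUT	((__force demo_poll_t)0x0004)

/* A ->poll()-like helper that only ever builds its mask from constants. */
static demo_poll_t demo_poll(int readable, int writable)
{
	demo_poll_t mask = 0;

	if (readable)
		mask |= DEMO_POLLIN;
	if (writable)
		mask |= DEMO_POLLOUT;
	return mask;
}
```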
FWIW, I've just updated that queue to -rc1. The branch is currently at #misc.poll
and it's fairly close to "all warnings left are genuine".
drivers/media/pci/saa7164/saa7164-vbi.c:632:24: expected restricted __poll_t
drivers/media/pci/saa7164/saa7164-vbi.c:637:40: expected restricted __poll_t
drivers/media/pci/saa7164/saa7164-vbi.c:647:40: expected restricted __poll_t
drivers/media/platform/exynos-gsc/gsc-m2m.c:718:24: expected restricted __poll_t
drivers/media/platform/s3c-camif/camif-capture.c:602:21: expected restricted __poll_t [usertype] ret
drivers/media/radio/radio-wl1273.c:1088:24: expected restricted __poll_t
drivers/platform/goldfish/goldfish_pipe.c:549:24: expected restricted __poll_t
drivers/tty/n_r3964.c:1241:24: expected restricted __poll_t [assigned] [usertype] result
drivers/uio/uio.c:505:24: expected restricted __poll_t
kernel/trace/ring_buffer.c:640:32: expected restricted __poll_t
sound/core/seq/oss/seq_oss.c:206:24: expected restricted __poll_t
sound/core/seq/seq_clientmgr.c:1092:24: expected restricted __poll_t
These are ->poll() instances returning -E... in some cases. All genuine bugs.
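To make that bug class concrete, here is a minimal userspace sketch. The
struct and both functions are hypothetical and not taken from any of the
drivers listed; returning POLLERR | POLLHUP is just one plausible repair, and
the right mask depends on the driver.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical device state, for illustration only. */
struct demo_dev {
	int present;		/* device still attached? */
	int data_ready;		/* anything to read? */
};

/*
 * Buggy pattern: a negative errno is returned where a POLL... bitmap is
 * expected.  Converted to unsigned, -ENODEV has most bits set, so the
 * caller sees POLLIN, POLLOUT and more all reported at once.
 */
static unsigned int demo_poll_buggy(const struct demo_dev *dev)
{
	if (!dev->present)
		return -ENODEV;
	return dev->data_ready ? POLLIN : 0;
}

/* One possible fix: report the failure through the mask itself. */
static unsigned int demo_poll_fixed(const struct demo_dev *dev)
{
	if (!dev->present)
		return POLLERR | POLLHUP;
	return dev->data_ready ? POLLIN : 0;
}
```

With the return type annotated as a bitwise type, sparse would flag the
"return -ENODEV" line, which is the kind of warning shown in the list above.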
kernel/events/core.c:4561:24: expected restricted __poll_t [usertype] events
kernel/events/ring_buffer.c:22:39: got restricted __poll_t [usertype] <noident>
atomic_{set,xchg}() used on __poll_t; false positives, these two.
drivers/media/i2c/saa6588.c:416:35: right side has type restricted __poll_t
drivers/media/pci/bt8xx/bttv-driver.c:3347:20: got restricted __poll_t [assigned] [usertype] res
drivers/media/pci/bt8xx/bttv-driver.c:3350:19: expected restricted __poll_t
drivers/media/pci/saa7134/saa7134-video.c:1243:16: warning: restricted __poll_t degrades to integer
drivers/media/pci/saa7134/saa7134-video.c:1243:19: expected restricted __poll_t
saa6588_ioctl(, SAA6588_CMD_POLL, ) stores POLL... bitmap in the same field
where other subfunctions store int. Could be annotated away (union in the
structure being filled), but... not much point, TBH. Ugly misannotation,
but no more than that.
fs/fuse/file.c:2761:25: warning: cast from restricted __poll_t
fs/fuse/file.c:2783:30: expected restricted __poll_t
fuse puts POLL... bitmaps on the wire. That's a problem waiting to happen,
in theory - different architectures have different encodings.
fs/eventpoll.c:1168:28: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:1208:57: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:1212:57: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:2054:49: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:2114:37: right side has type restricted __poll_t
fs/eventpoll.c:2130:45: right side has type restricted __poll_t
fs/eventpoll.c:880:18: expected restricted __poll_t [usertype] _key
fs/eventpoll.c:880:18: expected restricted __poll_t [usertype] _key
fs/eventpoll.c:880:18: expected restricted __poll_t [usertype] _key
fs/eventpoll.c:880:18: expected restricted __poll_t [usertype] _key
fs/eventpoll.c:882:41: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:882:41: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:882:41: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:882:41: warning: restricted __poll_t degrades to integer
fs/eventpoll.c:895:39: got restricted __poll_t
fs/eventpoll.c:951:32: expected restricted __poll_t
A bloody mess. This is a genuine ABI problem, and I'm not sure if it
_can_ be fixed. struct epoll_event contains a field with or'ed
EPOLL... constants. They are defined in libc and are arch-independent.
The kernel assumes that POLL... constants are equal to corresponding
EPOLL... ones. Unfortunately, that is not true. The first few POLL...
constants are universal and do match EPOLL...; however, starting with
POLLWRNORM they diverge on quite a few architectures.
	common	bfin,frv,m68k,mips	xtensa	sparc
WRNORM	bit 8	2			2	2
WRBAND	bit 9	8			8	8
MSG	bit 10	10			10	9
REMOVE	bit 12	12			11	10
RDHUP	bit 13	13			13	11
Now, POLLREMOVE doesn't have an EPOLL... equivalent, but the others
do. As a result, blackfin, frv, m68k, mips and xtensa have
EPOLLWRNORM matching POLLWRBAND and EPOLLWRBAND not matching
anything. sparc has EPOLLWRNORM matching POLLWRBAND, EPOLLWRBAND
matching POLLMSG (and never triggered), EPOLLMSG matching POLLREMOVE
(and also never triggered) and EPOLLRDHUP not matching anything.
I don't believe that anything tries to use EPOLLMSG; EPOLLWRBAND
and EPOLLWRNORM might be used (even though our manpage doesn't
document either). EPOLLRDHUP _is_ documented and flat-out does
not work on sparc; the only way to catch POLLRDHUP via epoll
there is to give it a value that is not any of the EPOLL... constants.
Hell knows if anything tries to do it there...
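To illustrate the mismatch, here is a small sketch of what an explicit
translation from the arch-independent EPOLL... encoding to the sparc POLL...
encoding might look like. The bit positions come from the table above; the
DEMO_ names and the function are hypothetical, nothing like this exists in
the tree at this point, and only the bits from the table plus the low eight
bits (which are the same everywhere) are handled.

```c
#include <stdint.h>

#define DEMO_BIT(n)	(UINT32_C(1) << (n))

/* Arch-independent EPOLL... encoding ("common" column of the table). */
#define DEMO_EPOLLWRNORM	DEMO_BIT(8)
#define DEMO_EPOLLWRBAND	DEMO_BIT(9)
#define DEMO_EPOLLMSG		DEMO_BIT(10)
#define DEMO_EPOLLRDHUP		DEMO_BIT(13)

/* sparc POLL... encoding ("sparc" column of the table). */
#define DEMO_SPARC_POLLWRNORM	DEMO_BIT(2)
#define DEMO_SPARC_POLLWRBAND	DEMO_BIT(8)
#define DEMO_SPARC_POLLMSG	DEMO_BIT(9)
#define DEMO_SPARC_POLLREMOVE	DEMO_BIT(10)
#define DEMO_SPARC_POLLRDHUP	DEMO_BIT(11)

/* Bits 0..7 (POLLIN through POLLRDBAND) are identical on all architectures. */
#define DEMO_COMMON_MASK	UINT32_C(0xff)

/* Translate an EPOLL... bitmap into the sparc POLL... encoding. */
static uint32_t demo_epoll_to_sparc_poll(uint32_t ev)
{
	uint32_t mask = ev & DEMO_COMMON_MASK;

	if (ev & DEMO_EPOLLWRNORM)
		mask |= DEMO_SPARC_POLLWRNORM;
	if (ev & DEMO_EPOLLWRBAND)
		mask |= DEMO_SPARC_POLLWRBAND;
	if (ev & DEMO_EPOLLMSG)
		mask |= DEMO_SPARC_POLLMSG;
	if (ev & DEMO_EPOLLRDHUP)
		mask |= DEMO_SPARC_POLLRDHUP;
	return mask;
}
```

Without such a step the bitmap is used as-is, so a request for
DEMO_EPOLLRDHUP (bit 13) never meets DEMO_SPARC_POLLRDHUP (bit 11), and
DEMO_EPOLLWRNORM lands on the same bit as DEMO_SPARC_POLLWRBAND.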
Comments?