2018-07-10 00:19:00

by Stephen Rothwell

[permalink] [raw]
Subject: linux-next: manual merge of the vfs tree with the overlayfs tree

Hi all,

Today's linux-next merge of the vfs tree got conflicts in:

fs/open.c

between commit:

d4d6f39c507e ("vfs: optionally don't account file in nr_files")
88498a6bd8d1 ("vfs: simplify dentry_open()")

from the overlayfs tree and commit:

5f0cc0005d2e ("introduce FMODE_OPENED")
13fcca01ef86 ("now we can fold open_check_o_direct() into do_dentry_open()")
7dfb1eeaea37 ("introduce FMODE_CREATED and switch to it")

from the vfs tree.

I *think* I got this right, but please check. Is there some way that
these overlayfs commits can be integrated into the vfs tree development
(or vice versa). Also, I would expect to see *some* Reviewed-by or
Acked-by lines in all this work ...

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging. You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

--
Cheers,
Stephen Rothwell

diff --cc fs/open.c
index 52a1ea584669,39dcbf13031c..000000000000
--- a/fs/open.c
+++ b/fs/open.c
@@@ -731,7 -732,6 +721,7 @@@ static int do_dentry_open(struct file *
static const struct file_operations empty_fops = {};
int error;

- WARN_ON(f->f_mode & ~FMODE_NOACCOUNT);
++ WARN_ON(f->f_mode & ~ (FMODE_NOACCOUNT | FMODE_CREATED));
f->f_mode |= OPEN_FMODE(f->f_flags) | FMODE_LSEEK |
FMODE_PREAD | FMODE_PWRITE;

@@@ -743,8 -743,7 +733,8 @@@
f->f_wb_err = filemap_sample_wb_err(f->f_mapping);

if (unlikely(f->f_flags & O_PATH)) {
+ WARN_ON(f->f_mode & FMODE_NOACCOUNT);
- f->f_mode = FMODE_PATH;
+ f->f_mode = FMODE_PATH | FMODE_OPENED;
f->f_op = &empty_fops;
return 0;
}
@@@ -890,47 -893,14 +884,44 @@@ EXPORT_SYMBOL(file_path)
int vfs_open(const struct path *path, struct file *file,
const struct cred *cred)
{
- struct dentry *dentry = d_real(path->dentry, NULL, file->f_flags, 0);
+ file->f_path = *path;
+ return do_dentry_open(file, d_backing_inode(path->dentry), NULL, cred);
+}

- if (IS_ERR(dentry))
- return PTR_ERR(dentry);
+/**
+ * path_open() - Open an inode by a particular name.
+ * @path: The name of the file.
+ * @flags: The O_ flags used to open this file.
+ * @inode: The inode to open.
+ * @cred: The task's credentials used when opening this file.
+ *
+ * Context: Process context.
+ * Return: A pointer to a struct file or an IS_ERR pointer. Cannot return NULL.
+ */
+struct file *path_open(const struct path *path, int flags, struct inode *inode,
+ const struct cred *cred, bool account)
+{
+ struct file *file;
+ int error;
+
+ file = __get_empty_filp(account);
+ if (IS_ERR(file))
+ return file;

+ file->f_flags = flags;
file->f_path = *path;
- return do_dentry_open(file, d_backing_inode(dentry), NULL, cred);
+ error = do_dentry_open(file, inode, NULL, cred);
+ if (error) {
- put_filp(file);
++ if (file->f_mode & FMODE_OPENED)
++ fput(file);
++ else
++ put_filp(file);
+ return ERR_PTR(error);
+ }
+
- error = open_check_o_direct(file);
- if (error) {
- fput(file);
- file = ERR_PTR(error);
- }
-
+ return file;
}
+EXPORT_SYMBOL(path_open);

struct file *dentry_open(const struct path *path, int flags,
const struct cred *cred)


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2018-07-10 15:07:12

by Al Viro

[permalink] [raw]
Subject: Re: linux-next: manual merge of the vfs tree with the overlayfs tree

On Tue, Jul 10, 2018 at 10:17:36AM +1000, Stephen Rothwell wrote:
> --- a/fs/open.c
> +++ b/fs/open.c
> @@@ -731,7 -732,6 +721,7 @@@ static int do_dentry_open(struct file *
> static const struct file_operations empty_fops = {};
> int error;
>
> - WARN_ON(f->f_mode & ~FMODE_NOACCOUNT);
> ++ WARN_ON(f->f_mode & ~ (FMODE_NOACCOUNT | FMODE_CREATED));
> f->f_mode |= OPEN_FMODE(f->f_flags) | FMODE_LSEEK |
> FMODE_PREAD | FMODE_PWRITE;

That part is sane

> +/**
> + * path_open() - Open an inode by a particular name.
> + * @path: The name of the file.
> + * @flags: The O_ flags used to open this file.
> + * @inode: The inode to open.
> + * @cred: The task's credentials used when opening this file.
> + *
> + * Context: Process context.
> + * Return: A pointer to a struct file or an IS_ERR pointer. Cannot return NULL.
> + */
> +struct file *path_open(const struct path *path, int flags, struct inode *inode,
> + const struct cred *cred, bool account)
> +{
> + struct file *file;
> + int error;
> +
> + file = __get_empty_filp(account);
> + if (IS_ERR(file))
> + return file;
>
> + file->f_flags = flags;
> file->f_path = *path;
> - return do_dentry_open(file, d_backing_inode(dentry), NULL, cred);
> + error = do_dentry_open(file, inode, NULL, cred);
> + if (error) {
> - put_filp(file);
> ++ if (file->f_mode & FMODE_OPENED)
> ++ fput(file);
> ++ else
> ++ put_filp(file);
> + return ERR_PTR(error);
> + }
> +
> - error = open_check_o_direct(file);
> - if (error) {
> - fput(file);
> - file = ERR_PTR(error);
> - }
> -
> + return file;
> }
> +EXPORT_SYMBOL(path_open);

First of all, I'm still not at all convinced that this "noaccount" thing is
sane, especially since path_open() is exported. But that aside, __get_empty_filp()
needs to be shot, just for the name and calling conventions alone.

It gets a bullshit argument (bool account) *AND* does not get the argument it
does need. Note that the first thing get_empty_filp() (now __get...) does
is
const struct cred *cred = current_cred();
followed by
f->f_cred = get_cred(cred);

Now look at path_open(). What happens to the cred argument it gets? It goes
to do_dentry_open(), where it gets passed to security_file_open() and not
used by anything else. In security_file_open() we have it passed to
ret = call_int_hook(file_open, 0, file, cred);
and there are three instances of ->file_open() - apparmor, selinux and tomoyo.
The last one ignores cred entirely; the other two do checks based on it,
but *all* of them leave file->f_cred as it was.

This is not a new crap. It had been inherited from dentry_open(), which got it
from "CRED: Pass credentials through dentry_open()" back in 2008. Note that
* among the callers of dentry_open() (20) and vfs_open() (2 more)
only these
fs/cachefiles/rdwr.c:913: file = dentry_open(&path, O_RDWR | O_LARGEFILE, cache->cache_cred);
security/apparmor/file.c:695: devnull = dentry_open(&aa_null, O_RDWR, cred);
security/selinux/hooks.c:2628: devnull = dentry_open(&selinux_null, O_RDWR, cred);
get cred != current_cred(). Which helps masking the issue, but makes the
decision to add that argument (instead of a separate helper) rather dubious.
* overlayfs itself appears to *have* run into the problem, judging
by
old_cred = ovl_override_creds(inode->i_sb);
realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
realinode, current_cred(), false);
revert_creds(old_cred);
in there.

Folks, if you have to go to that kind of contortions, why not do it right?
* add static __alloc_file(cred), which would get cred pointer (and not
use current_cred() internally), allocated a file (without bothering with
nr_files) and returned it
* have alloc_empty_file(cred) that would do the song and dance
with nr_files (and used __alloc_file() internally).
* use that as a replacement for get_empty_filp() - path_openat() would
*probably* use current_cred() for argument, alloc_file() definitely would and
dentry_open() would pass its cred argument.
* in internal.h, static inline alloc_empty_file_noaccount(cred) would
use __alloc_file() and set FMODE_NOACCOUNT in case of success.
* do_dentry_open() loses the fucking cred argument - it should be in
file->f_cred.
* vfs_open() goes away - in your branch it's absolutely pointless.
* path_open() loses its 'account' argument - it's always false.
Uses alloc_empty_file_noaccount() to allocate the sucker. And for fsck
sake, pass it the creds you want to use rather than playing that kind of
games with override/revert.

2018-07-11 02:12:33

by Al Viro

[permalink] [raw]
Subject: Re: linux-next: manual merge of the vfs tree with the overlayfs tree

On Tue, Jul 10, 2018 at 04:04:55PM +0100, Al Viro wrote:
> First of all, I'm still not at all convinced that this "noaccount" thing is
> sane, especially since path_open() is exported. But that aside, __get_empty_filp()
> needs to be shot, just for the name and calling conventions alone.
>
> It gets a bullshit argument (bool account) *AND* does not get the argument it
> does need. Note that the first thing get_empty_filp() (now __get...) does
> is
> const struct cred *cred = current_cred();
> followed by
> f->f_cred = get_cred(cred);
>
> Now look at path_open(). What happens to the cred argument it gets? It goes
> to do_dentry_open(), where it gets passed to security_file_open() and not
> used by anything else. In security_file_open() we have it passed to
> ret = call_int_hook(file_open, 0, file, cred);
> and there are three instances of ->file_open() - apparmor, selinux and tomoyo.
> The last one ignores cred entirely; the other two do checks based on it,
> but *all* of them leave file->f_cred as it was.
>
> This is not a new crap. It had been inherited from dentry_open(), which got it
> from "CRED: Pass credentials through dentry_open()" back in 2008. Note that
> * among the callers of dentry_open() (20) and vfs_open() (2 more)
> only these
> fs/cachefiles/rdwr.c:913: file = dentry_open(&path, O_RDWR | O_LARGEFILE, cache->cache_cred);
> security/apparmor/file.c:695: devnull = dentry_open(&aa_null, O_RDWR, cred);
> security/selinux/hooks.c:2628: devnull = dentry_open(&selinux_null, O_RDWR, cred);
> get cred != current_cred(). Which helps masking the issue, but makes the
> decision to add that argument (instead of a separate helper) rather dubious.
> * overlayfs itself appears to *have* run into the problem, judging
> by
> old_cred = ovl_override_creds(inode->i_sb);
> realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
> realinode, current_cred(), false);
> revert_creds(old_cred);
> in there.
>
> Folks, if you have to go to that kind of contortions, why not do it right?
> * add static __alloc_file(cred), which would get cred pointer (and not
> use current_cred() internally), allocated a file (without bothering with
> nr_files) and returned it
> * have alloc_empty_file(cred) that would do the song and dance
> with nr_files (and used __alloc_file() internally).
> * use that as a replacement for get_empty_filp() - path_openat() would
> *probably* use current_cred() for argument, alloc_file() definitely would and
> dentry_open() would pass its cred argument.
> * in internal.h, static inline alloc_empty_file_noaccount(cred) would
> use __alloc_file() and set FMODE_NOACCOUNT in case of success.
> * do_dentry_open() loses the fucking cred argument - it should be in
> file->f_cred.
> * vfs_open() goes away - in your branch it's absolutely pointless.
> * path_open() loses its 'account' argument - it's always false.
> Uses alloc_empty_file_noaccount() to allocate the sucker. And for fsck
> sake, pass it the creds you want to use rather than playing that kind of
> games with override/revert.

FWIW, see vfs.git#work.open2 for part of that program. I have not pulled
the overlayfs stuff in, but doing the rest based at circa e9cf4c40af4c
("fold put_filp() into fput()") is trivial. Remains to be done out of the
above:
* add static __alloc_file(cred), which would be basically
alloc_empty_file() sans nr_files-related parts. Make alloc_empty_file()
call it for actual allocation.
* in internal.h, static inline alloc_empty_file_noaccount(cred) would
use __alloc_file() and set FMODE_NOACCOUNT in case of success.
* path_open() loses its 'account' argument - it's always false.
Uses alloc_empty_file_noaccount() to allocate the sucker. And for fsck
sake, pass it the creds you want to use rather than playing that kind of
games with override/revert.
* (after the overlayfs changes to vfs_open()) vfs_open() goes away,
expanded into callers.

Stuff currently in work.open2:

Preparatory fixes (this cycle fodder):
drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()
cxl_getfile(): fix double-iput() on alloc_file() failures
ocxlflash_getfile(): fix double-iput() on alloc_file() failures
More preparatory massage:
make get_empty_filp() to call file_free_rcu() directly
fold security_file_free() into file_free()
turn filp_clone_open() into inline wrapper for dentry_open()
create_pipe_files(): use fput() if allocation of the second file fails
make sure do_dentry_open() won't return positive as an error
Providing the right ->f_cred:
pass creds to get_empty_filp(), make sure dentry_open() passes the right creds
get rid of cred argument of vfs_open() and do_dentry_open()
security_file_open(): lose cred argument
->file_open(): lose cred argument
Mirror the "do we need fput() or do we need put_filp()" in ->f_mode:
introduce FMODE_OPENED
fold put_filp() into fput()
At that point we can start simplifying open-related paths, now that
cleanups are a lot more uniform:
lift fput() on late failures into path_openat()
now we can fold open_check_o_direct() into do_dentry_open()
switch all remaining checks for FILE_OPENED to FMODE_OPENED
Half of the reasons for ->atomic_open() 'opened' argument is gone, let's deal with
the rest:
introduce FMODE_CREATED and switch to it
IMA: don't propagate opened through the entire thing
getting rid of 'opened' argument of ->atomic_open() - step 1
getting rid of 'opened' argument of ->atomic_open() - part 2
get rid of 'opened' argument of ->atomic_open() - part 3
get rid of 'opened' in path_openat() and the helpers downstream
->atomic_open(): return 0 in all success cases
document ->atomic_open() changes
switch atomic_open() and lookup_open() to returning 0 in all success cases
kill FILE_{CREATED,OPENED}
At that point we've got _much_ simpler ->atomic_open() calling conventions as well
as the code using it. Next part - alloc_file() calling conventions:
new wrapper: alloc_file_pseudo()
__shmem_file_setup(): reorder allocations
... and switch shmem_file_setup() to alloc_file_pseudo()
cxl_getfile(): switch to alloc_file_pseudo()
ocxlflash_getfile(): switch to alloc_file_pseudo()
hugetlb_file_setup(): switch to alloc_file_pseudo()
anon_inode_getfile(): switch to alloc_file_pseudo()
create_pipe_files(): switch the first allocation to alloc_file_pseudo()
new helper: alloc_file_clone()
do_shmat(): grab shp->shm_file earlier, switch to alloc_file_clone()
make alloc_file() static
document alloc_file() changes
And cleanups in pathname-resolving loops from better calling conventions
for path_init()/link_path_walk():
make path_init() unconditionally paired with terminate_walk()
allow link_path_walk() to take ERR_PTR()
few more cleanups of link_path_walk() callers

IMO the diffstat is not too bad -

Documentation/filesystems/Locking | 2 +-
Documentation/filesystems/porting | 20 +++
Documentation/filesystems/vfs.txt | 18 +--
drivers/gpu/drm/drm_lease.c | 16 +--
drivers/misc/cxl/api.c | 21 +---
drivers/scsi/cxlflash/ocxl_hw.c | 24 +---
fs/9p/vfs_inode.c | 7 +-
fs/9p/vfs_inode_dotl.c | 7 +-
fs/aio.c | 26 +---
fs/anon_inodes.c | 29 +----
fs/bad_inode.c | 2 +-
fs/binfmt_misc.c | 2 +-
fs/ceph/file.c | 7 +-
fs/ceph/super.h | 3 +-
fs/cifs/cifsfs.h | 3 +-
fs/cifs/dir.c | 7 +-
fs/file_table.c | 71 +++++++----
fs/fuse/dir.c | 10 +-
fs/gfs2/inode.c | 32 +++--
fs/hugetlbfs/inode.c | 55 +++------
fs/internal.h | 6 +-
fs/namei.c | 223 +++++++++++++---------------------
fs/nfs/dir.c | 14 ++-
fs/nfs/nfs4_fs.h | 2 +-
fs/nfs/nfs4proc.c | 2 +-
fs/nfsd/vfs.c | 2 +-
fs/open.c | 86 ++++---------
fs/pipe.c | 38 ++----
include/linux/file.h | 8 +-
include/linux/fs.h | 16 +--
include/linux/ima.h | 4 +-
include/linux/lsm_hooks.h | 2 +-
include/linux/security.h | 5 +-
ipc/shm.c | 39 +++---
mm/shmem.c | 50 ++------
net/socket.c | 27 +---
security/apparmor/lsm.c | 4 +-
security/integrity/ima/ima.h | 4 +-
security/integrity/ima/ima_appraise.c | 4 +-
security/integrity/ima/ima_main.c | 16 +--
security/security.c | 4 +-
security/selinux/hooks.c | 4 +-
security/smack/smack_lsm.c | 6 +-
security/tomoyo/tomoyo.c | 2 +-
44 files changed, 360 insertions(+), 570 deletions(-)

2018-07-11 02:23:09

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 38/42] make alloc_file() static

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/file_table.c | 3 +--
include/linux/file.h | 2 --
2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 9097a6fb5a2f..c5f651fd6830 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -155,7 +155,7 @@ struct file *alloc_empty_file(const struct cred *cred)
* @mode: the mode with which the new file will be opened
* @fop: the 'struct file_operations' for the new file
*/
-struct file *alloc_file(const struct path *path, fmode_t mode,
+static struct file *alloc_file(const struct path *path, fmode_t mode,
const struct file_operations *fop)
{
struct file *file;
@@ -180,7 +180,6 @@ struct file *alloc_file(const struct path *path, fmode_t mode,
i_readcount_inc(path->dentry->d_inode);
return file;
}
-EXPORT_SYMBOL(alloc_file);

struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
const char *name, fmode_t mode,
diff --git a/include/linux/file.h b/include/linux/file.h
index 22becbfd4cec..325b36ca336d 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -19,8 +19,6 @@ struct vfsmount;
struct dentry;
struct inode;
struct path;
-extern struct file *alloc_file(const struct path *, fmode_t mode,
- const struct file_operations *fop);
extern struct file *alloc_file_pseudo(struct inode *, struct vfsmount *,
const char *, fmode_t, const struct file_operations *);
extern struct file *alloc_file_clone(struct file *, fmode_t,
--
2.11.0


2018-07-11 02:23:17

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 40/42] make path_init() unconditionally paired with terminate_walk()

From: Al Viro <[email protected]>

including the failure exits

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ca3f4dec8cda..a9a2a8ac8b9d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2125,12 +2125,15 @@ static int link_path_walk(const char *name, struct nameidata *nd)
}
}

+/* must be paired with terminate_walk() */
static const char *path_init(struct nameidata *nd, unsigned flags)
{
const char *s = nd->name->name;

if (!*s)
flags &= ~LOOKUP_RCU;
+ if (flags & LOOKUP_RCU)
+ rcu_read_lock();

nd->last_type = LAST_ROOT; /* if there are only slashes... */
nd->flags = flags | LOOKUP_JUMPED | LOOKUP_PARENT;
@@ -2143,7 +2146,6 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
nd->path = nd->root;
nd->inode = inode;
if (flags & LOOKUP_RCU) {
- rcu_read_lock();
nd->seq = __read_seqcount_begin(&nd->path.dentry->d_seq);
nd->root_seq = nd->seq;
nd->m_seq = read_seqbegin(&mount_lock);
@@ -2159,21 +2161,15 @@ static const char *path_init(struct nameidata *nd, unsigned flags)

nd->m_seq = read_seqbegin(&mount_lock);
if (*s == '/') {
- if (flags & LOOKUP_RCU)
- rcu_read_lock();
set_root(nd);
if (likely(!nd_jump_root(nd)))
return s;
- nd->root.mnt = NULL;
- rcu_read_unlock();
return ERR_PTR(-ECHILD);
} else if (nd->dfd == AT_FDCWD) {
if (flags & LOOKUP_RCU) {
struct fs_struct *fs = current->fs;
unsigned seq;

- rcu_read_lock();
-
do {
seq = read_seqcount_begin(&fs->seq);
nd->path = fs->pwd;
@@ -2195,16 +2191,13 @@ static const char *path_init(struct nameidata *nd, unsigned flags)

dentry = f.file->f_path.dentry;

- if (*s) {
- if (!d_can_lookup(dentry)) {
- fdput(f);
- return ERR_PTR(-ENOTDIR);
- }
+ if (*s && unlikely(!d_can_lookup(dentry))) {
+ fdput(f);
+ return ERR_PTR(-ENOTDIR);
}

nd->path = f.file->f_path;
if (flags & LOOKUP_RCU) {
- rcu_read_lock();
nd->inode = nd->path.dentry->d_inode;
nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
} else {
@@ -2272,8 +2265,10 @@ static int path_lookupat(struct nameidata *nd, unsigned flags, struct path *path
const char *s = path_init(nd, flags);
int err;

- if (IS_ERR(s))
+ if (IS_ERR(s)) {
+ terminate_walk(nd);
return PTR_ERR(s);
+ }

if (unlikely(flags & LOOKUP_DOWN)) {
err = handle_lookup_down(nd);
@@ -2337,8 +2332,10 @@ static int path_parentat(struct nameidata *nd, unsigned flags,
{
const char *s = path_init(nd, flags);
int err;
- if (IS_ERR(s))
+ if (IS_ERR(s)) {
+ terminate_walk(nd);
return PTR_ERR(s);
+ }
err = link_path_walk(s, nd);
if (!err)
err = complete_walk(nd);
@@ -2666,8 +2663,10 @@ path_mountpoint(struct nameidata *nd, unsigned flags, struct path *path)
{
const char *s = path_init(nd, flags);
int err;
- if (IS_ERR(s))
+ if (IS_ERR(s)) {
+ terminate_walk(nd);
return PTR_ERR(s);
+ }
while (!(err = link_path_walk(s, nd)) &&
(err = mountpoint_last(nd)) > 0) {
s = trailing_symlink(nd);
@@ -3514,6 +3513,7 @@ static struct file *path_openat(struct nameidata *nd,

s = path_init(nd, flags);
if (IS_ERR(s)) {
+ terminate_walk(nd);
fput(file);
return ERR_CAST(s);
}
--
2.11.0


2018-07-11 02:23:29

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 36/42] new helper: alloc_file_clone()

From: Al Viro <[email protected]>

alloc_file_clone(old_file, mode, ops): create a new struct file with
->f_path equal to that of old_file. pipe converted.

Signed-off-by: Al Viro <[email protected]>
---
fs/file_table.c | 11 +++++++++++
fs/pipe.c | 4 +---
include/linux/file.h | 2 ++
3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 23bc46ea5f4b..9097a6fb5a2f 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -209,6 +209,17 @@ struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
}
EXPORT_SYMBOL(alloc_file_pseudo);

+struct file *alloc_file_clone(struct file *base, fmode_t mode,
+ const struct file_operations *fops)
+{
+ struct file *f = alloc_file(&base->f_path, mode, fops);
+ if (!IS_ERR(f)) {
+ path_get(&f->f_path);
+ f->f_mapping = base->f_mapping;
+ }
+ return f;
+}
+
/* the real guts of fput() - releasing the last reference to file
*/
static void __fput(struct file *file)
diff --git a/fs/pipe.c b/fs/pipe.c
index c8c4265bf87b..94323a1d7c92 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -760,14 +760,12 @@ int create_pipe_files(struct file **res, int flags)
f->f_flags = O_WRONLY | (flags & (O_NONBLOCK | O_DIRECT));
f->private_data = inode->i_pipe;

- res[0] = alloc_file(&f->f_path, FMODE_READ, &pipefifo_fops);
+ res[0] = alloc_file_clone(f, FMODE_READ, &pipefifo_fops);
if (IS_ERR(res[0])) {
put_pipe_info(inode, inode->i_pipe);
fput(f);
return PTR_ERR(res[0]);
}
-
- path_get(&f->f_path);
res[0]->private_data = inode->i_pipe;
res[0]->f_flags = O_RDONLY | (flags & O_NONBLOCK);
res[1] = f;
diff --git a/include/linux/file.h b/include/linux/file.h
index 0a4ef2315f75..22becbfd4cec 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -23,6 +23,8 @@ extern struct file *alloc_file(const struct path *, fmode_t mode,
const struct file_operations *fop);
extern struct file *alloc_file_pseudo(struct inode *, struct vfsmount *,
const char *, fmode_t, const struct file_operations *);
+extern struct file *alloc_file_clone(struct file *, fmode_t,
+ const struct file_operations *);

static inline void fput_light(struct file *file, int fput_needed)
{
--
2.11.0


2018-07-11 02:23:54

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 42/42] few more cleanups of link_path_walk() callers

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 30 +++++++++++-------------------
1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index a55da43f169c..f032a30fc8c6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2269,10 +2269,8 @@ static int path_lookupat(struct nameidata *nd, unsigned flags, struct path *path

if (unlikely(flags & LOOKUP_DOWN) && !IS_ERR(s)) {
err = handle_lookup_down(nd);
- if (unlikely(err < 0)) {
- terminate_walk(nd);
- return err;
- }
+ if (unlikely(err < 0))
+ s = ERR_PTR(err);
}

while (!(err = link_path_walk(s, nd))
@@ -3472,7 +3470,6 @@ static int do_o_path(struct nameidata *nd, unsigned flags, struct file *file)
static struct file *path_openat(struct nameidata *nd,
const struct open_flags *op, unsigned flags)
{
- const char *s;
struct file *file;
int error;

@@ -3484,22 +3481,17 @@ static struct file *path_openat(struct nameidata *nd,

if (unlikely(file->f_flags & __O_TMPFILE)) {
error = do_tmpfile(nd, flags, op, file);
- goto out2;
- }
-
- if (unlikely(file->f_flags & O_PATH)) {
+ } else if (unlikely(file->f_flags & O_PATH)) {
error = do_o_path(nd, flags, file);
- goto out2;
- }
-
- s = path_init(nd, flags);
- while (!(error = link_path_walk(s, nd)) &&
- (error = do_last(nd, file, op)) > 0) {
- nd->flags &= ~(LOOKUP_OPEN|LOOKUP_CREATE|LOOKUP_EXCL);
- s = trailing_symlink(nd);
+ } else {
+ const char *s = path_init(nd, flags);
+ while (!(error = link_path_walk(s, nd)) &&
+ (error = do_last(nd, file, op)) > 0) {
+ nd->flags &= ~(LOOKUP_OPEN|LOOKUP_CREATE|LOOKUP_EXCL);
+ s = trailing_symlink(nd);
+ }
+ terminate_walk(nd);
}
- terminate_walk(nd);
-out2:
if (likely(!error)) {
if (likely(file->f_mode & FMODE_OPENED))
return file;
--
2.11.0


2018-07-11 02:23:56

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 41/42] allow link_path_walk() to take ERR_PTR()

From: Al Viro <[email protected]>

There is a check for IS_ERR(name) immediately upstream of each call
of link_path_walk(name, nd), with positives treated as if link_path_walk()
failed with PTR_ERR(name). Taking that check into link_path_walk() itself
simplifies things nicely.

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 38 +++++---------------------------------
1 file changed, 5 insertions(+), 33 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index a9a2a8ac8b9d..a55da43f169c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2028,6 +2028,8 @@ static int link_path_walk(const char *name, struct nameidata *nd)
{
int err;

+ if (IS_ERR(name))
+ return PTR_ERR(name);
while (*name=='/')
name++;
if (!*name)
@@ -2265,12 +2267,7 @@ static int path_lookupat(struct nameidata *nd, unsigned flags, struct path *path
const char *s = path_init(nd, flags);
int err;

- if (IS_ERR(s)) {
- terminate_walk(nd);
- return PTR_ERR(s);
- }
-
- if (unlikely(flags & LOOKUP_DOWN)) {
+ if (unlikely(flags & LOOKUP_DOWN) && !IS_ERR(s)) {
err = handle_lookup_down(nd);
if (unlikely(err < 0)) {
terminate_walk(nd);
@@ -2281,10 +2278,6 @@ static int path_lookupat(struct nameidata *nd, unsigned flags, struct path *path
while (!(err = link_path_walk(s, nd))
&& ((err = lookup_last(nd)) > 0)) {
s = trailing_symlink(nd);
- if (IS_ERR(s)) {
- err = PTR_ERR(s);
- break;
- }
}
if (!err)
err = complete_walk(nd);
@@ -2331,12 +2324,7 @@ static int path_parentat(struct nameidata *nd, unsigned flags,
struct path *parent)
{
const char *s = path_init(nd, flags);
- int err;
- if (IS_ERR(s)) {
- terminate_walk(nd);
- return PTR_ERR(s);
- }
- err = link_path_walk(s, nd);
+ int err = link_path_walk(s, nd);
if (!err)
err = complete_walk(nd);
if (!err) {
@@ -2663,17 +2651,10 @@ path_mountpoint(struct nameidata *nd, unsigned flags, struct path *path)
{
const char *s = path_init(nd, flags);
int err;
- if (IS_ERR(s)) {
- terminate_walk(nd);
- return PTR_ERR(s);
- }
+
while (!(err = link_path_walk(s, nd)) &&
(err = mountpoint_last(nd)) > 0) {
s = trailing_symlink(nd);
- if (IS_ERR(s)) {
- err = PTR_ERR(s);
- break;
- }
}
if (!err) {
*path = nd->path;
@@ -3512,19 +3493,10 @@ static struct file *path_openat(struct nameidata *nd,
}

s = path_init(nd, flags);
- if (IS_ERR(s)) {
- terminate_walk(nd);
- fput(file);
- return ERR_CAST(s);
- }
while (!(error = link_path_walk(s, nd)) &&
(error = do_last(nd, file, op)) > 0) {
nd->flags &= ~(LOOKUP_OPEN|LOOKUP_CREATE|LOOKUP_EXCL);
s = trailing_symlink(nd);
- if (IS_ERR(s)) {
- error = PTR_ERR(s);
- break;
- }
}
terminate_walk(nd);
out2:
--
2.11.0


2018-07-11 02:24:52

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 32/42] ocxlflash_getfile(): switch to alloc_file_pseudo()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
drivers/scsi/cxlflash/ocxl_hw.c | 23 ++++-------------------
1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 497a68389461..6d0632174ec6 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -88,10 +88,8 @@ static struct file *ocxlflash_getfile(struct device *dev, const char *name,
const struct file_operations *fops,
void *priv, int flags)
{
- struct qstr this;
- struct path path;
struct file *file;
- struct inode *inode = NULL;
+ struct inode *inode;
int rc;

if (fops->owner && !try_module_get(fops->owner)) {
@@ -116,26 +114,13 @@ static struct file *ocxlflash_getfile(struct device *dev, const char *name,
goto err3;
}

- this.name = name;
- this.len = strlen(name);
- this.hash = 0;
- path.dentry = d_alloc_pseudo(ocxlflash_vfs_mount->mnt_sb, &this);
- if (!path.dentry) {
- dev_err(dev, "%s: d_alloc_pseudo failed\n", __func__);
- rc = -ENOMEM;
- goto err4;
- }
-
- path.mnt = mntget(ocxlflash_vfs_mount);
- d_instantiate(path.dentry, inode);
-
- file = alloc_file(&path, OPEN_FMODE(flags), fops);
+ file = alloc_file_pseudo(inode, ocxlflash_vfs_mount, name,
+ OPEN_FMODE(flags), fops);
if (IS_ERR(file)) {
rc = PTR_ERR(file);
dev_err(dev, "%s: alloc_file failed rc=%d\n",
__func__, rc);
- path_put(&path);
- goto err3;
+ goto err4;
}

file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
--
2.11.0


2018-07-11 02:24:52

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 39/42] document alloc_file() changes

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
Documentation/filesystems/porting | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index c68ea9294b5f..77e2d623e115 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -610,3 +610,15 @@ in your dentry operations instead.
value for 'called finish_no_open(), open it yourself' case has become
0, not 1. Since finish_no_open() itself is returning 0 now, that part
does not need any changes in ->atomic_open() instances.
+--
+[mandatory]
+ alloc_file() has become static now; two wrappers are to be used instead.
+ alloc_file_pseudo(inode, vfsmount, name, mode, ops) is for the cases
+ when dentry needs to be created; that's the majority of old alloc_file()
+ users. Calling conventions: on success a reference to new struct file
+ is returned and callers reference to inode is subsumed by that. On
+ failure, ERR_PTR() is returned and no caller's references are affected,
+ so the caller needs to drop the inode reference it held.
+ alloc_file_clone(file, mode, ops) does not affect any caller's references.
+ On success you get a new struct file sharing the mount/dentry with the
+ original, on failure - ERR_PTR().
--
2.11.0


2018-07-11 02:25:03

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 31/42] cxl_getfile(): switch to alloc_file_pseudo()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
drivers/misc/cxl/api.c | 21 ++++-----------------
1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6b16946f9b05..e0b9c00aecde 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -67,10 +67,8 @@ static struct file *cxl_getfile(const char *name,
const struct file_operations *fops,
void *priv, int flags)
{
- struct qstr this;
- struct path path;
struct file *file;
- struct inode *inode = NULL;
+ struct inode *inode;
int rc;

/* strongly inspired by anon_inode_getfile() */
@@ -91,22 +89,11 @@ static struct file *cxl_getfile(const char *name,
goto err_fs;
}

- file = ERR_PTR(-ENOMEM);
- this.name = name;
- this.len = strlen(name);
- this.hash = 0;
- path.dentry = d_alloc_pseudo(cxl_vfs_mount->mnt_sb, &this);
- if (!path.dentry)
+ file = alloc_file_pseudo(inode, cxl_vfs_mount, name,
+ OPEN_FMODE(flags), fops);
+ if (IS_ERR(file))
goto err_inode;

- path.mnt = mntget(cxl_vfs_mount);
- d_instantiate(path.dentry, inode);
-
- file = alloc_file(&path, OPEN_FMODE(flags), fops);
- if (IS_ERR(file)) {
- path_put(&path);
- goto err_fs;
- }
file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
file->private_data = priv;

--
2.11.0


2018-07-11 02:25:06

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 34/42] anon_inode_getfile(): switch to alloc_file_pseudo()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/anon_inodes.c | 29 ++++++-----------------------
1 file changed, 6 insertions(+), 23 deletions(-)

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 3168ee4e77f4..7e13edd23db1 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -71,8 +71,6 @@ struct file *anon_inode_getfile(const char *name,
const struct file_operations *fops,
void *priv, int flags)
{
- struct qstr this;
- struct path path;
struct file *file;

if (IS_ERR(anon_inode_inode))
@@ -82,29 +80,15 @@ struct file *anon_inode_getfile(const char *name,
return ERR_PTR(-ENOENT);

/*
- * Link the inode to a directory entry by creating a unique name
- * using the inode sequence number.
- */
- file = ERR_PTR(-ENOMEM);
- this.name = name;
- this.len = strlen(name);
- this.hash = 0;
- path.dentry = d_alloc_pseudo(anon_inode_mnt->mnt_sb, &this);
- if (!path.dentry)
- goto err_module;
-
- path.mnt = mntget(anon_inode_mnt);
- /*
* We know the anon_inode inode count is always greater than zero,
* so ihold() is safe.
*/
ihold(anon_inode_inode);
-
- d_instantiate(path.dentry, anon_inode_inode);
-
- file = alloc_file(&path, OPEN_FMODE(flags), fops);
+ file = alloc_file_pseudo(anon_inode_inode, anon_inode_mnt, name,
+ OPEN_FMODE(flags), fops);
if (IS_ERR(file))
- goto err_dput;
+ goto err;
+
file->f_mapping = anon_inode_inode->i_mapping;

file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
@@ -112,9 +96,8 @@ struct file *anon_inode_getfile(const char *name,

return file;

-err_dput:
- path_put(&path);
-err_module:
+err:
+ iput(anon_inode_inode);
module_put(fops->owner);
return file;
}
--
2.11.0


2018-07-11 02:25:06

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 33/42] hugetlb_file_setup(): switch to alloc_file_pseudo()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/hugetlbfs/inode.c | 55 ++++++++++++++++------------------------------------
1 file changed, 17 insertions(+), 38 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index d508c7844681..86ffe04f73d6 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1308,10 +1308,6 @@ static int get_hstate_idx(int page_size_log)
return h - hstates;
}

-static const struct dentry_operations anon_ops = {
- .d_dname = simple_dname
-};
-
/*
* Note that size should be aligned to proper hugepage size in caller side,
* otherwise hugetlb_reserve_pages reserves one less hugepages than intended.
@@ -1320,19 +1316,18 @@ struct file *hugetlb_file_setup(const char *name, size_t size,
vm_flags_t acctflag, struct user_struct **user,
int creat_flags, int page_size_log)
{
- struct file *file = ERR_PTR(-ENOMEM);
struct inode *inode;
- struct path path;
- struct super_block *sb;
- struct qstr quick_string;
+ struct vfsmount *mnt;
int hstate_idx;
+ struct file *file;

hstate_idx = get_hstate_idx(page_size_log);
if (hstate_idx < 0)
return ERR_PTR(-ENODEV);

*user = NULL;
- if (!hugetlbfs_vfsmount[hstate_idx])
+ mnt = hugetlbfs_vfsmount[hstate_idx];
+ if (!mnt)
return ERR_PTR(-ENOENT);

if (creat_flags == HUGETLB_SHMFS_INODE && !can_do_hugetlb_shm()) {
@@ -1348,45 +1343,29 @@ struct file *hugetlb_file_setup(const char *name, size_t size,
}
}

- sb = hugetlbfs_vfsmount[hstate_idx]->mnt_sb;
- quick_string.name = name;
- quick_string.len = strlen(quick_string.name);
- quick_string.hash = 0;
- path.dentry = d_alloc_pseudo(sb, &quick_string);
- if (!path.dentry)
- goto out_shm_unlock;
-
- d_set_d_op(path.dentry, &anon_ops);
- path.mnt = mntget(hugetlbfs_vfsmount[hstate_idx]);
file = ERR_PTR(-ENOSPC);
- inode = hugetlbfs_get_inode(sb, NULL, S_IFREG | S_IRWXUGO, 0);
+ inode = hugetlbfs_get_inode(mnt->mnt_sb, NULL, S_IFREG | S_IRWXUGO, 0);
if (!inode)
- goto out_dentry;
+ goto out;
if (creat_flags == HUGETLB_SHMFS_INODE)
inode->i_flags |= S_PRIVATE;

- file = ERR_PTR(-ENOMEM);
- if (hugetlb_reserve_pages(inode, 0,
- size >> huge_page_shift(hstate_inode(inode)), NULL,
- acctflag))
- goto out_inode;
-
- d_instantiate(path.dentry, inode);
inode->i_size = size;
clear_nlink(inode);

- file = alloc_file(&path, FMODE_WRITE | FMODE_READ,
- &hugetlbfs_file_operations);
- if (IS_ERR(file))
- goto out_dentry; /* inode is already attached */
-
- return file;
+ if (hugetlb_reserve_pages(inode, 0,
+ size >> huge_page_shift(hstate_inode(inode)), NULL,
+ acctflag))
+ file = ERR_PTR(-ENOMEM);
+ else
+ file = alloc_file_pseudo(inode, mnt, name,
+ FMODE_WRITE | FMODE_READ,
+ &hugetlbfs_file_operations);
+ if (!IS_ERR(file))
+ return file;

-out_inode:
iput(inode);
-out_dentry:
- path_put(&path);
-out_shm_unlock:
+out:
if (*user) {
user_shm_unlock(size, *user);
*user = NULL;
--
2.11.0


2018-07-11 02:25:23

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 35/42] create_pipe_files(): switch the first allocation to alloc_file_pseudo()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/pipe.c | 31 ++++++-------------------------
1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 9405e455f5b1..c8c4265bf87b 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -744,53 +744,34 @@ static struct inode * get_pipe_inode(void)

int create_pipe_files(struct file **res, int flags)
{
- int err;
struct inode *inode = get_pipe_inode();
struct file *f;
- struct path path;

if (!inode)
return -ENFILE;

- err = -ENOMEM;
- path.dentry = d_alloc_pseudo(pipe_mnt->mnt_sb, &empty_name);
- if (!path.dentry)
- goto err_inode;
- path.mnt = mntget(pipe_mnt);
-
- d_instantiate(path.dentry, inode);
-
- f = alloc_file(&path, FMODE_WRITE, &pipefifo_fops);
+ f = alloc_file_pseudo(inode, pipe_mnt, "", FMODE_WRITE, &pipefifo_fops);
if (IS_ERR(f)) {
- err = PTR_ERR(f);
- goto err_dentry;
+ free_pipe_info(inode->i_pipe);
+ iput(inode);
+ return PTR_ERR(f);
}

f->f_flags = O_WRONLY | (flags & (O_NONBLOCK | O_DIRECT));
f->private_data = inode->i_pipe;

- res[0] = alloc_file(&path, FMODE_READ, &pipefifo_fops);
+ res[0] = alloc_file(&f->f_path, FMODE_READ, &pipefifo_fops);
if (IS_ERR(res[0])) {
put_pipe_info(inode, inode->i_pipe);
fput(f);
return PTR_ERR(res[0]);
}

- path_get(&path);
+ path_get(&f->f_path);
res[0]->private_data = inode->i_pipe;
res[0]->f_flags = O_RDONLY | (flags & O_NONBLOCK);
res[1] = f;
return 0;
-
-err_dentry:
- free_pipe_info(inode->i_pipe);
- path_put(&path);
- return err;
-
-err_inode:
- free_pipe_info(inode->i_pipe);
- iput(inode);
- return err;
}

static int __do_pipe_flags(int *fd, struct file **files, int flags)
--
2.11.0


2018-07-11 02:25:31

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 30/42] ... and switch shmem_file_setup() to alloc_file_pseudo()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
mm/shmem.c | 39 ++++++++-------------------------------
1 file changed, 8 insertions(+), 31 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 1146a48926bc..fd21df189f32 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3896,18 +3896,11 @@ EXPORT_SYMBOL_GPL(shmem_truncate_range);

/* common code */

-static const struct dentry_operations anon_ops = {
- .d_dname = simple_dname
-};
-
static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, loff_t size,
unsigned long flags, unsigned int i_flags)
{
- struct file *res;
struct inode *inode;
- struct path path;
- struct super_block *sb;
- struct qstr this;
+ struct file *res;

if (IS_ERR(mnt))
return ERR_CAST(mnt);
@@ -3918,8 +3911,8 @@ static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, l
if (shmem_acct_size(flags, size))
return ERR_PTR(-ENOMEM);

- sb = mnt->mnt_sb;
- inode = shmem_get_inode(sb, NULL, S_IFREG | S_IRWXUGO, 0, flags);
+ inode = shmem_get_inode(mnt->mnt_sb, NULL, S_IFREG | S_IRWXUGO, 0,
+ flags);
if (unlikely(!inode)) {
shmem_unacct_size(flags, size);
return ERR_PTR(-ENOSPC);
@@ -3928,28 +3921,12 @@ static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, l
inode->i_size = size;
clear_nlink(inode); /* It is unlinked */
res = ERR_PTR(ramfs_nommu_expand_for_mapping(inode, size));
- if (IS_ERR(res)) {
- iput(inode);
- return res;
- }
-
- this.name = name;
- this.len = strlen(name);
- this.hash = 0; /* will go */
- path.mnt = mntget(mnt);
- path.dentry = d_alloc_pseudo(sb, &this);
- if (!path.dentry) {
- iput(inode);
- return ERR_PTR(-ENOMEM);
- }
- d_set_d_op(path.dentry, &anon_ops);
-
- d_instantiate(path.dentry, inode);
-
- res = alloc_file(&path, FMODE_WRITE | FMODE_READ,
- &shmem_file_operations);
+ if (!IS_ERR(res))
+ res = alloc_file_pseudo(inode, mnt, name,
+ FMODE_WRITE | FMODE_READ,
+ &shmem_file_operations);
if (IS_ERR(res))
- path_put(&path);
+ iput(inode);
return res;
}

--
2.11.0


2018-07-11 02:25:32

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 37/42] do_shmat(): grab shp->shm_file earlier, switch to alloc_file_clone()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
ipc/shm.c | 39 ++++++++++++++++++---------------------
1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 051a3e1fb8df..0cebcf74b669 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1354,14 +1354,13 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg,
struct shmid_kernel *shp;
unsigned long addr = (unsigned long)shmaddr;
unsigned long size;
- struct file *file;
+ struct file *file, *base;
int err;
unsigned long flags = MAP_SHARED;
unsigned long prot;
int acc_mode;
struct ipc_namespace *ns;
struct shm_file_data *sfd;
- struct path path;
fmode_t f_mode;
unsigned long populate = 0;

@@ -1435,46 +1434,44 @@ long do_shmat(int shmid, char __user *shmaddr, int shmflg,
goto out_unlock;
}

- path = shp->shm_file->f_path;
- path_get(&path);
+ /*
+ * We need to take a reference to the real shm file to prevent the
+ * pointer from becoming stale in cases where the lifetime of the outer
+ * file extends beyond that of the shm segment. It's not usually
+ * possible, but it can happen during remap_file_pages() emulation as
+ * that unmaps the memory, then does ->mmap() via file reference only.
+ * We'll deny the ->mmap() if the shm segment was since removed, but to
+ * detect shm ID reuse we need to compare the file pointers.
+ */
+ base = get_file(shp->shm_file);
shp->shm_nattch++;
- size = i_size_read(d_inode(path.dentry));
+ size = i_size_read(file_inode(base));
ipc_unlock_object(&shp->shm_perm);
rcu_read_unlock();

err = -ENOMEM;
sfd = kzalloc(sizeof(*sfd), GFP_KERNEL);
if (!sfd) {
- path_put(&path);
+ fput(base);
goto out_nattch;
}

- file = alloc_file(&path, f_mode,
- is_file_hugepages(shp->shm_file) ?
+ file = alloc_file_clone(base, f_mode,
+ is_file_hugepages(base) ?
&shm_file_operations_huge :
&shm_file_operations);
err = PTR_ERR(file);
if (IS_ERR(file)) {
kfree(sfd);
- path_put(&path);
+ fput(base);
goto out_nattch;
}

- file->private_data = sfd;
- file->f_mapping = shp->shm_file->f_mapping;
sfd->id = shp->shm_perm.id;
sfd->ns = get_ipc_ns(ns);
- /*
- * We need to take a reference to the real shm file to prevent the
- * pointer from becoming stale in cases where the lifetime of the outer
- * file extends beyond that of the shm segment. It's not usually
- * possible, but it can happen during remap_file_pages() emulation as
- * that unmaps the memory, then does ->mmap() via file reference only.
- * We'll deny the ->mmap() if the shm segment was since removed, but to
- * detect shm ID reuse we need to compare the file pointers.
- */
- sfd->file = get_file(shp->shm_file);
+ sfd->file = base;
sfd->vm_ops = NULL;
+ file->private_data = sfd;

err = security_mmap_file(file, prot, flags);
if (err)
--
2.11.0


2018-07-11 02:25:35

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 26/42] switch atomic_open() and lookup_open() to returning 0 in all success cases

From: Al Viro <[email protected]>

caller can tell "opened" from "open it yourself" by looking at ->f_mode.

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 8b1e5cb85f58..ca3f4dec8cda 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3027,9 +3027,9 @@ static int may_o_create(const struct path *dir, struct dentry *dentry, umode_t m
* Returns 0 if successful. The file will have been created and attached to
* @file by the filesystem calling finish_open().
*
- * Returns 1 if the file was looked up only or didn't need creating. The
- * caller will need to perform the open themselves. @path will have been
- * updated to point to the new dentry. This may be negative.
+ * If the file was looked up only or didn't need creating, FMODE_OPENED won't
+ * be set. The caller will need to perform the open themselves. @path will
+ * have been updated to point to the new dentry. This may be negative.
*
* Returns an error code otherwise.
*/
@@ -3082,7 +3082,7 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
} else {
path->dentry = dentry;
path->mnt = nd->path.mnt;
- return 1;
+ return 0;
}
}
}
@@ -3093,17 +3093,17 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
/*
* Look up and maybe create and open the last component.
*
- * Must be called with i_mutex held on parent.
- *
- * Returns 0 if the file was successfully atomically created (if necessary) and
- * opened. In this case the file will be returned attached to @file.
+ * Must be called with parent locked (exclusive in O_CREAT case).
*
- * Returns 1 if the file was not completely opened at this time, though lookups
- * and creations will have been performed and the dentry returned in @path will
- * be positive upon return if O_CREAT was specified. If O_CREAT wasn't
- * specified then a negative dentry may be returned.
+ * Returns 0 on success, that is, if
+ * the file was successfully atomically created (if necessary) and opened, or
+ * the file was not completely opened at this time, though lookups and
+ * creations were performed.
+ * These case are distinguished by presence of FMODE_OPENED on file->f_mode.
+ * In the latter case dentry returned in @path might be negative if O_CREAT
+ * hadn't been specified.
*
- * An error code is returned otherwise.
+ * An error code is returned on failure.
*/
static int lookup_open(struct nameidata *nd, struct path *path,
struct file *file,
@@ -3225,7 +3225,7 @@ static int lookup_open(struct nameidata *nd, struct path *path,
out_no_open:
path->dentry = dentry;
path->mnt = nd->path.mnt;
- return 1;
+ return 0;

out_dput:
dput(dentry);
@@ -3308,10 +3308,10 @@ static int do_last(struct nameidata *nd,
else
inode_unlock_shared(dir->d_inode);

- if (error <= 0) {
- if (error)
- goto out;
+ if (error)
+ goto out;

+ if (file->f_mode & FMODE_OPENED) {
if ((file->f_mode & FMODE_CREATED) ||
!S_ISREG(file_inode(file)->i_mode))
will_truncate = false;
--
2.11.0


2018-07-11 02:25:45

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 25/42] document ->atomic_open() changes

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
Documentation/filesystems/Locking | 2 +-
Documentation/filesystems/porting | 8 ++++++++
Documentation/filesystems/vfs.txt | 18 ++++++++++--------
3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 2c391338c675..04c867981375 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -64,7 +64,7 @@ prototypes:
void (*update_time)(struct inode *, struct timespec *, int);
int (*atomic_open)(struct inode *, struct dentry *,
struct file *, unsigned open_flag,
- umode_t create_mode, int *opened);
+ umode_t create_mode);
int (*tmpfile) (struct inode *, struct dentry *, umode_t);

locking rules:
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index 17bb4dc28fae..c68ea9294b5f 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -602,3 +602,11 @@ in your dentry operations instead.
dentry separately, and it now has request_mask and query_flags arguments
to specify the fields and sync type requested by statx. Filesystems not
supporting any statx-specific features may ignore the new arguments.
+--
+[mandatory]
+ ->atomic_open() calling conventions have changed. Gone is int *opened,
+ along with FILE_OPENED/FILE_CREATED. In place of those we have
+ FMODE_OPENED/FMODE_CREATED, set in file->f_mode. Additionally, return
+ value for 'called finish_no_open(), open it yourself' case has become
+ 0, not 1. Since finish_no_open() itself is returning 0 now, that part
+ does not need any changes in ->atomic_open() instances.
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 829a7b7857a4..d564cc44397e 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -386,7 +386,7 @@ struct inode_operations {
ssize_t (*listxattr) (struct dentry *, char *, size_t);
void (*update_time)(struct inode *, struct timespec *, int);
int (*atomic_open)(struct inode *, struct dentry *, struct file *,
- unsigned open_flag, umode_t create_mode, int *opened);
+ unsigned open_flag, umode_t create_mode);
int (*tmpfile) (struct inode *, struct dentry *, umode_t);
};

@@ -496,13 +496,15 @@ otherwise noted.

atomic_open: called on the last component of an open. Using this optional
method the filesystem can look up, possibly create and open the file in
- one atomic operation. If it cannot perform this (e.g. the file type
- turned out to be wrong) it may signal this by returning 1 instead of
- usual 0 or -ve . This method is only called if the last component is
- negative or needs lookup. Cached positive dentries are still handled by
- f_op->open(). If the file was created, the FILE_CREATED flag should be
- set in "opened". In case of O_EXCL the method must only succeed if the
- file didn't exist and hence FILE_CREATED shall always be set on success.
+ one atomic operation. If it wants to leave actual opening to the
+ caller (e.g. if the file turned out to be a symlink, device, or just
+ something filesystem won't do atomic open for), it may signal this by
+ returning finish_no_open(file, dentry). This method is only called if
+ the last component is negative or needs lookup. Cached positive dentries
+ are still handled by f_op->open(). If the file was created,
+ FMODE_CREATED flag should be set in file->f_mode. In case of O_EXCL
+ the method must only succeed if the file didn't exist and hence FMODE_CREATED
+ shall always be set on success.

tmpfile: called in the end of O_TMPFILE open(). Optional, equivalent to
atomically creating, opening and unlinking a file in given directory.
--
2.11.0


2018-07-11 02:25:48

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 28/42] new wrapper: alloc_file_pseudo()

From: Al Viro <[email protected]>

takes inode, vfsmount, name, mode and file_operations and either
returns a new struct file (in which case inode reference we held
is consumed) or returns ERR_PTR(), in which case no refcounts are
altered.

converted aio_private_file() and sock_alloc_file() to it

Signed-off-by: Al Viro <[email protected]>
---
fs/aio.c | 26 ++++++--------------------
fs/file_table.c | 27 +++++++++++++++++++++++++++
include/linux/file.h | 3 +++
net/socket.c | 27 ++++-----------------------
4 files changed, 40 insertions(+), 43 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index e1d20124ec0e..c5244c68f90e 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -215,9 +215,7 @@ static const struct address_space_operations aio_ctx_aops;

static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
{
- struct qstr this = QSTR_INIT("[aio]", 5);
struct file *file;
- struct path path;
struct inode *inode = alloc_anon_inode(aio_mnt->mnt_sb);
if (IS_ERR(inode))
return ERR_CAST(inode);
@@ -226,31 +224,19 @@ static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
inode->i_mapping->private_data = ctx;
inode->i_size = PAGE_SIZE * nr_pages;

- path.dentry = d_alloc_pseudo(aio_mnt->mnt_sb, &this);
- if (!path.dentry) {
+ file = alloc_file_pseudo(inode, aio_mnt, "[aio]",
+ FMODE_READ | FMODE_WRITE, &aio_ring_fops);
+ if (IS_ERR(file))
iput(inode);
- return ERR_PTR(-ENOMEM);
- }
- path.mnt = mntget(aio_mnt);
-
- d_instantiate(path.dentry, inode);
- file = alloc_file(&path, FMODE_READ | FMODE_WRITE, &aio_ring_fops);
- if (IS_ERR(file)) {
- path_put(&path);
- return file;
- }
-
- file->f_flags = O_RDWR;
+ else
+ file->f_flags = O_RDWR;
return file;
}

static struct dentry *aio_mount(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data)
{
- static const struct dentry_operations ops = {
- .d_dname = simple_dname,
- };
- struct dentry *root = mount_pseudo(fs_type, "aio:", NULL, &ops,
+ struct dentry *root = mount_pseudo(fs_type, "aio:", NULL, NULL,
AIO_RING_MAGIC);

if (!IS_ERR(root))
diff --git a/fs/file_table.c b/fs/file_table.c
index d55dc579ae4e..23bc46ea5f4b 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -182,6 +182,33 @@ struct file *alloc_file(const struct path *path, fmode_t mode,
}
EXPORT_SYMBOL(alloc_file);

+struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
+ const char *name, fmode_t mode,
+ const struct file_operations *fops)
+{
+ static const struct dentry_operations anon_ops = {
+ .d_dname = simple_dname
+ };
+ struct qstr this = QSTR_INIT(name, strlen(name));
+ struct path path;
+ struct file *file;
+
+ path.dentry = d_alloc_pseudo(mnt->mnt_sb, &this);
+ if (!path.dentry)
+ return ERR_PTR(-ENOMEM);
+ if (!mnt->mnt_sb->s_d_op)
+ d_set_d_op(path.dentry, &anon_ops);
+ path.mnt = mntget(mnt);
+ d_instantiate(path.dentry, inode);
+ file = alloc_file(&path, mode, fops);
+ if (IS_ERR(file)) {
+ ihold(inode);
+ path_put(&path);
+ }
+ return file;
+}
+EXPORT_SYMBOL(alloc_file_pseudo);
+
/* the real guts of fput() - releasing the last reference to file
*/
static void __fput(struct file *file)
diff --git a/include/linux/file.h b/include/linux/file.h
index d771bc05da77..0a4ef2315f75 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -17,9 +17,12 @@ extern void fput(struct file *);
struct file_operations;
struct vfsmount;
struct dentry;
+struct inode;
struct path;
extern struct file *alloc_file(const struct path *, fmode_t mode,
const struct file_operations *fop);
+extern struct file *alloc_file_pseudo(struct inode *, struct vfsmount *,
+ const char *, fmode_t, const struct file_operations *);

static inline void fput_light(struct file *file, int fput_needed)
{
diff --git a/net/socket.c b/net/socket.c
index 8a109012608a..81cf9906cae5 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -391,33 +391,14 @@ static struct file_system_type sock_fs_type = {

struct file *sock_alloc_file(struct socket *sock, int flags, const char *dname)
{
- struct qstr name = { .name = "" };
- struct path path;
struct file *file;

- if (dname) {
- name.name = dname;
- name.len = strlen(name.name);
- } else if (sock->sk) {
- name.name = sock->sk->sk_prot_creator->name;
- name.len = strlen(name.name);
- }
- path.dentry = d_alloc_pseudo(sock_mnt->mnt_sb, &name);
- if (unlikely(!path.dentry)) {
- sock_release(sock);
- return ERR_PTR(-ENOMEM);
- }
- path.mnt = mntget(sock_mnt);
-
- d_instantiate(path.dentry, SOCK_INODE(sock));
+ if (!dname)
+ dname = sock->sk ? sock->sk->sk_prot_creator->name : "";

- file = alloc_file(&path, FMODE_READ | FMODE_WRITE,
- &socket_file_ops);
+ file = alloc_file_pseudo(SOCK_INODE(sock), sock_mnt, dname,
+ FMODE_READ | FMODE_WRITE, &socket_file_ops);
if (IS_ERR(file)) {
- /* drop dentry, keep inode for a bit */
- ihold(d_inode(path.dentry));
- path_put(&path);
- /* ... and now kill it properly */
sock_release(sock);
return file;
}
--
2.11.0


2018-07-11 02:25:57

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 23/42] get rid of 'opened' in path_openat() and the helpers downstream

From: Al Viro <[email protected]>

unused now

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 22 ++++++++--------------
1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 96daa644e9e3..4ce780a8ebbc 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3036,8 +3036,7 @@ static int may_o_create(const struct path *dir, struct dentry *dentry, umode_t m
static int atomic_open(struct nameidata *nd, struct dentry *dentry,
struct path *path, struct file *file,
const struct open_flags *op,
- int open_flag, umode_t mode,
- int *opened)
+ int open_flag, umode_t mode)
{
struct dentry *const DENTRY_NOT_SET = (void *) -1UL;
struct inode *dir = nd->path.dentry->d_inode;
@@ -3105,14 +3104,11 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
* specified then a negative dentry may be returned.
*
* An error code is returned otherwise.
- *
- * FILE_CREATE will be set in @*opened if the dentry was created and will be
- * cleared otherwise prior to returning.
*/
static int lookup_open(struct nameidata *nd, struct path *path,
struct file *file,
const struct open_flags *op,
- bool got_write, int *opened)
+ bool got_write)
{
struct dentry *dir = nd->path.dentry;
struct inode *dir_inode = dir->d_inode;
@@ -3187,7 +3183,7 @@ static int lookup_open(struct nameidata *nd, struct path *path,

if (dir_inode->i_op->atomic_open) {
error = atomic_open(nd, dentry, path, file, op, open_flag,
- mode, opened);
+ mode);
if (unlikely(error == -ENOENT) && create_error)
error = create_error;
return error;
@@ -3240,8 +3236,7 @@ static int lookup_open(struct nameidata *nd, struct path *path,
* Handle the last step of open()
*/
static int do_last(struct nameidata *nd,
- struct file *file, const struct open_flags *op,
- int *opened)
+ struct file *file, const struct open_flags *op)
{
struct dentry *dir = nd->path.dentry;
int open_flag = op->open_flag;
@@ -3307,7 +3302,7 @@ static int do_last(struct nameidata *nd,
inode_lock(dir->d_inode);
else
inode_lock_shared(dir->d_inode);
- error = lookup_open(nd, &path, file, op, got_write, opened);
+ error = lookup_open(nd, &path, file, op, got_write);
if (open_flag & O_CREAT)
inode_unlock(dir->d_inode);
else
@@ -3452,7 +3447,7 @@ EXPORT_SYMBOL(vfs_tmpfile);

static int do_tmpfile(struct nameidata *nd, unsigned flags,
const struct open_flags *op,
- struct file *file, int *opened)
+ struct file *file)
{
struct dentry *child;
struct path path;
@@ -3499,7 +3494,6 @@ static struct file *path_openat(struct nameidata *nd,
{
const char *s;
struct file *file;
- int opened = 0;
int error;

file = alloc_empty_file(current_cred());
@@ -3509,7 +3503,7 @@ static struct file *path_openat(struct nameidata *nd,
file->f_flags = op->open_flag;

if (unlikely(file->f_flags & __O_TMPFILE)) {
- error = do_tmpfile(nd, flags, op, file, &opened);
+ error = do_tmpfile(nd, flags, op, file);
goto out2;
}

@@ -3524,7 +3518,7 @@ static struct file *path_openat(struct nameidata *nd,
return ERR_CAST(s);
}
while (!(error = link_path_walk(s, nd)) &&
- (error = do_last(nd, file, op, &opened)) > 0) {
+ (error = do_last(nd, file, op)) > 0) {
nd->flags &= ~(LOOKUP_OPEN|LOOKUP_CREATE|LOOKUP_EXCL);
s = trailing_symlink(nd);
if (IS_ERR(s)) {
--
2.11.0


2018-07-11 02:26:00

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 29/42] __shmem_file_setup(): reorder allocations

From: Al Viro <[email protected]>

grab inode and reserve memory first.

Signed-off-by: Al Viro <[email protected]>
---
mm/shmem.c | 43 ++++++++++++++++++++-----------------------
1 file changed, 20 insertions(+), 23 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 2cab84403055..1146a48926bc 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3918,41 +3918,38 @@ static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, l
if (shmem_acct_size(flags, size))
return ERR_PTR(-ENOMEM);

- res = ERR_PTR(-ENOMEM);
+ sb = mnt->mnt_sb;
+ inode = shmem_get_inode(sb, NULL, S_IFREG | S_IRWXUGO, 0, flags);
+ if (unlikely(!inode)) {
+ shmem_unacct_size(flags, size);
+ return ERR_PTR(-ENOSPC);
+ }
+ inode->i_flags |= i_flags;
+ inode->i_size = size;
+ clear_nlink(inode); /* It is unlinked */
+ res = ERR_PTR(ramfs_nommu_expand_for_mapping(inode, size));
+ if (IS_ERR(res)) {
+ iput(inode);
+ return res;
+ }
+
this.name = name;
this.len = strlen(name);
this.hash = 0; /* will go */
- sb = mnt->mnt_sb;
path.mnt = mntget(mnt);
path.dentry = d_alloc_pseudo(sb, &this);
- if (!path.dentry)
- goto put_memory;
+ if (!path.dentry) {
+ iput(inode);
+ return ERR_PTR(-ENOMEM);
+ }
d_set_d_op(path.dentry, &anon_ops);

- res = ERR_PTR(-ENOSPC);
- inode = shmem_get_inode(sb, NULL, S_IFREG | 0777, 0, flags);
- if (!inode)
- goto put_memory;
-
- inode->i_flags |= i_flags;
d_instantiate(path.dentry, inode);
- inode->i_size = size;
- clear_nlink(inode); /* It is unlinked */
- res = ERR_PTR(ramfs_nommu_expand_for_mapping(inode, size));
- if (IS_ERR(res))
- goto put_path;

res = alloc_file(&path, FMODE_WRITE | FMODE_READ,
&shmem_file_operations);
if (IS_ERR(res))
- goto put_path;
-
- return res;
-
-put_memory:
- shmem_unacct_size(flags, size);
-put_path:
- path_put(&path);
+ path_put(&path);
return res;
}

--
2.11.0


2018-07-11 02:26:04

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 22/42] get rid of 'opened' argument of ->atomic_open() - part 3

From: Al Viro <[email protected]>

now it can be done...

Signed-off-by: Al Viro <[email protected]>
---
fs/9p/vfs_inode.c | 3 +--
fs/9p/vfs_inode_dotl.c | 3 +--
fs/bad_inode.c | 2 +-
fs/ceph/file.c | 3 +--
fs/ceph/super.h | 3 +--
fs/cifs/cifsfs.h | 3 +--
fs/cifs/dir.c | 3 +--
fs/fuse/dir.c | 2 +-
fs/gfs2/inode.c | 3 +--
fs/namei.c | 3 +--
fs/nfs/dir.c | 2 +-
fs/nfs/nfs4_fs.h | 2 +-
include/linux/fs.h | 2 +-
13 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 7b6ff3275d9c..85ff859d3af5 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -859,8 +859,7 @@ struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry,

static int
v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
- struct file *file, unsigned flags, umode_t mode,
- int *opened)
+ struct file *file, unsigned flags, umode_t mode)
{
int err;
u32 perm;
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index c6939b7cb18c..4823e1c46999 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -241,8 +241,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,

static int
v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
- struct file *file, unsigned flags, umode_t omode,
- int *opened)
+ struct file *file, unsigned flags, umode_t omode)
{
int err = 0;
kgid_t gid;
diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 125e8bbd22a2..8035d2a44561 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -134,7 +134,7 @@ static int bad_inode_update_time(struct inode *inode, struct timespec64 *time,

static int bad_inode_atomic_open(struct inode *inode, struct dentry *dentry,
struct file *file, unsigned int open_flag,
- umode_t create_mode, int *opened)
+ umode_t create_mode)
{
return -EIO;
}
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 38b28cb2fac1..e2679e8a2535 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -429,8 +429,7 @@ int ceph_open(struct inode *inode, struct file *file)
* file or symlink, return 1 so the VFS can retry.
*/
int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
- struct file *file, unsigned flags, umode_t mode,
- int *opened)
+ struct file *file, unsigned flags, umode_t mode)
{
struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb);
struct ceph_mds_client *mdsc = fsc->mdsc;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index a7077a0c989f..971328b99ede 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1025,8 +1025,7 @@ extern const struct file_operations ceph_file_fops;
extern int ceph_renew_caps(struct inode *inode);
extern int ceph_open(struct inode *inode, struct file *file);
extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
- struct file *file, unsigned flags, umode_t mode,
- int *opened);
+ struct file *file, unsigned flags, umode_t mode);
extern int ceph_release(struct inode *inode, struct file *filp);
extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page,
char *data, size_t len);
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 5f0231803431..f3a78efc3109 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -65,8 +65,7 @@ extern struct inode *cifs_root_iget(struct super_block *);
extern int cifs_create(struct inode *, struct dentry *, umode_t,
bool excl);
extern int cifs_atomic_open(struct inode *, struct dentry *,
- struct file *, unsigned, umode_t,
- int *);
+ struct file *, unsigned, umode_t);
extern struct dentry *cifs_lookup(struct inode *, struct dentry *,
unsigned int);
extern int cifs_unlink(struct inode *dir, struct dentry *dentry);
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 891bfd62e67a..3713d22b95a7 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -465,8 +465,7 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,

int
cifs_atomic_open(struct inode *inode, struct dentry *direntry,
- struct file *file, unsigned oflags, umode_t mode,
- int *opened)
+ struct file *file, unsigned oflags, umode_t mode)
{
int rc;
unsigned int xid;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index b8d7e9d423c8..c979329311c8 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -489,7 +489,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
static int fuse_mknod(struct inode *, struct dentry *, umode_t, dev_t);
static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
struct file *file, unsigned flags,
- umode_t mode, int *opened)
+ umode_t mode)
{
int err;
struct fuse_conn *fc = get_fuse_conn(dir);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 15e2a8a3b917..648f0ca1ad57 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -1228,14 +1228,13 @@ static int gfs2_mknod(struct inode *dir, struct dentry *dentry, umode_t mode,
* @file: The proposed new struct file
* @flags: open flags
* @mode: File mode
- * @opened: Flag to say whether the file has been opened or not
*
* Returns: error code or 0 for success
*/

static int gfs2_atomic_open(struct inode *dir, struct dentry *dentry,
struct file *file, unsigned flags,
- umode_t mode, int *opened)
+ umode_t mode)
{
struct dentry *d;
bool excl = !!(flags & O_EXCL);
diff --git a/fs/namei.c b/fs/namei.c
index cc412d0620a3..96daa644e9e3 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3052,8 +3052,7 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
file->f_path.dentry = DENTRY_NOT_SET;
file->f_path.mnt = nd->path.mnt;
error = dir->i_op->atomic_open(dir, dentry, file,
- open_to_namei_flags(open_flag),
- mode, opened);
+ open_to_namei_flags(open_flag), mode);
d_lookup_done(dentry);
if (!error) {
/*
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 71ae3cc3e53a..f447b1a24350 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1451,7 +1451,7 @@ static int nfs_finish_open(struct nfs_open_context *ctx,

int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
struct file *file, unsigned open_flags,
- umode_t mode, int *opened)
+ umode_t mode)
{
DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
struct nfs_open_context *ctx;
diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h
index 137e18abb7e7..51beb6e38c90 100644
--- a/fs/nfs/nfs4_fs.h
+++ b/fs/nfs/nfs4_fs.h
@@ -258,7 +258,7 @@ extern const struct dentry_operations nfs4_dentry_operations;

/* dir.c */
int nfs_atomic_open(struct inode *, struct dentry *, struct file *,
- unsigned, umode_t, int *);
+ unsigned, umode_t);

/* super.c */
extern struct file_system_type nfs4_fs_type;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 70be3e4c26ac..c25896b30e9f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1781,7 +1781,7 @@ struct inode_operations {
int (*update_time)(struct inode *, struct timespec64 *, int);
int (*atomic_open)(struct inode *, struct dentry *,
struct file *, unsigned open_flag,
- umode_t create_mode, int *opened);
+ umode_t create_mode);
int (*tmpfile) (struct inode *, struct dentry *, umode_t);
int (*set_acl)(struct inode *, struct posix_acl *, int);
} ____cacheline_aligned;
--
2.11.0


2018-07-11 02:26:04

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 24/42] ->atomic_open(): return 0 in all success cases

From: Al Viro <[email protected]>

FMODE_OPENED can be used to distingusish "successful open" from the
"called finish_no_open(), do it yourself" cases. Since finish_no_open()
has been adjusted, no changes in the instances were actually needed.
The caller has been adjusted.

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 30 +++++++++++++++---------------
fs/open.c | 4 ++--
2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 4ce780a8ebbc..8b1e5cb85f58 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3054,21 +3054,21 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
open_to_namei_flags(open_flag), mode);
d_lookup_done(dentry);
if (!error) {
- /*
- * We didn't have the inode before the open, so check open
- * permission here.
- */
- int acc_mode = op->acc_mode;
- if (file->f_mode & FMODE_CREATED) {
- WARN_ON(!(open_flag & O_CREAT));
- fsnotify_create(dir, dentry);
- acc_mode = 0;
- }
- error = may_open(&file->f_path, acc_mode, open_flag);
- if (WARN_ON(error > 0))
- error = -EINVAL;
- } else if (error > 0) {
- if (WARN_ON(file->f_path.dentry == DENTRY_NOT_SET)) {
+ if (file->f_mode & FMODE_OPENED) {
+ /*
+ * We didn't have the inode before the open, so check open
+ * permission here.
+ */
+ int acc_mode = op->acc_mode;
+ if (file->f_mode & FMODE_CREATED) {
+ WARN_ON(!(open_flag & O_CREAT));
+ fsnotify_create(dir, dentry);
+ acc_mode = 0;
+ }
+ error = may_open(&file->f_path, acc_mode, open_flag);
+ if (WARN_ON(error > 0))
+ error = -EINVAL;
+ } else if (WARN_ON(file->f_path.dentry == DENTRY_NOT_SET)) {
error = -EIO;
} else {
if (file->f_path.dentry) {
diff --git a/fs/open.c b/fs/open.c
index a46ba6647562..2649b6695f60 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -866,13 +866,13 @@ EXPORT_SYMBOL(finish_open);
* NB: unlike finish_open() this function does consume the dentry reference and
* the caller need not dput() it.
*
- * Returns "1" which must be the return value of ->atomic_open() after having
+ * Returns "0" which must be the return value of ->atomic_open() after having
* called this function.
*/
int finish_no_open(struct file *file, struct dentry *dentry)
{
file->f_path.dentry = dentry;
- return 1;
+ return 0;
}
EXPORT_SYMBOL(finish_no_open);

--
2.11.0


2018-07-11 02:26:16

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 21/42] getting rid of 'opened' argument of ->atomic_open() - part 2

From: Al Viro <[email protected]>

__gfs2_lookup(), gfs2_create_inode(), nfs_finish_open() and fuse_create_open()
don't need 'opened' anymore. Get rid of that argument in those.

Signed-off-by: Al Viro <[email protected]>
---
fs/fuse/dir.c | 4 ++--
fs/gfs2/inode.c | 19 +++++++++----------
fs/nfs/dir.c | 5 ++---
3 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index a5b1f5ff8cb7..b8d7e9d423c8 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -399,7 +399,7 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
*/
static int fuse_create_open(struct inode *dir, struct dentry *entry,
struct file *file, unsigned flags,
- umode_t mode, int *opened)
+ umode_t mode)
{
int err;
struct inode *inode;
@@ -513,7 +513,7 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
if (fc->no_create)
goto mknod;

- err = fuse_create_open(dir, entry, file, flags, mode, opened);
+ err = fuse_create_open(dir, entry, file, flags, mode);
if (err == -ENOSYS) {
fc->no_create = 1;
goto mknod;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 59f695e96d63..15e2a8a3b917 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -580,7 +580,7 @@ static int gfs2_initxattrs(struct inode *inode, const struct xattr *xattr_array,
static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
struct file *file,
umode_t mode, dev_t dev, const char *symname,
- unsigned int size, int excl, int *opened)
+ unsigned int size, int excl)
{
const struct qstr *name = &dentry->d_name;
struct posix_acl *default_acl, *acl;
@@ -822,7 +822,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
static int gfs2_create(struct inode *dir, struct dentry *dentry,
umode_t mode, bool excl)
{
- return gfs2_create_inode(dir, dentry, NULL, S_IFREG | mode, 0, NULL, 0, excl, NULL);
+ return gfs2_create_inode(dir, dentry, NULL, S_IFREG | mode, 0, NULL, 0, excl);
}

/**
@@ -830,14 +830,13 @@ static int gfs2_create(struct inode *dir, struct dentry *dentry,
* @dir: The directory inode
* @dentry: The dentry of the new inode
* @file: File to be opened
- * @opened: atomic_open flags
*
*
* Returns: errno
*/

static struct dentry *__gfs2_lookup(struct inode *dir, struct dentry *dentry,
- struct file *file, int *opened)
+ struct file *file)
{
struct inode *inode;
struct dentry *d;
@@ -879,7 +878,7 @@ static struct dentry *__gfs2_lookup(struct inode *dir, struct dentry *dentry,
static struct dentry *gfs2_lookup(struct inode *dir, struct dentry *dentry,
unsigned flags)
{
- return __gfs2_lookup(dir, dentry, NULL, NULL);
+ return __gfs2_lookup(dir, dentry, NULL);
}

/**
@@ -1189,7 +1188,7 @@ static int gfs2_symlink(struct inode *dir, struct dentry *dentry,
if (size >= gfs2_max_stuffed_size(GFS2_I(dir)))
return -ENAMETOOLONG;

- return gfs2_create_inode(dir, dentry, NULL, S_IFLNK | S_IRWXUGO, 0, symname, size, 0, NULL);
+ return gfs2_create_inode(dir, dentry, NULL, S_IFLNK | S_IRWXUGO, 0, symname, size, 0);
}

/**
@@ -1204,7 +1203,7 @@ static int gfs2_symlink(struct inode *dir, struct dentry *dentry,
static int gfs2_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
{
unsigned dsize = gfs2_max_stuffed_size(GFS2_I(dir));
- return gfs2_create_inode(dir, dentry, NULL, S_IFDIR | mode, 0, NULL, dsize, 0, NULL);
+ return gfs2_create_inode(dir, dentry, NULL, S_IFDIR | mode, 0, NULL, dsize, 0);
}

/**
@@ -1219,7 +1218,7 @@ static int gfs2_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
static int gfs2_mknod(struct inode *dir, struct dentry *dentry, umode_t mode,
dev_t dev)
{
- return gfs2_create_inode(dir, dentry, NULL, mode, dev, NULL, 0, 0, NULL);
+ return gfs2_create_inode(dir, dentry, NULL, mode, dev, NULL, 0, 0);
}

/**
@@ -1244,7 +1243,7 @@ static int gfs2_atomic_open(struct inode *dir, struct dentry *dentry,
if (!d_in_lookup(dentry))
goto skip_lookup;

- d = __gfs2_lookup(dir, dentry, file, opened);
+ d = __gfs2_lookup(dir, dentry, file);
if (IS_ERR(d))
return PTR_ERR(d);
if (d != NULL)
@@ -1262,7 +1261,7 @@ static int gfs2_atomic_open(struct inode *dir, struct dentry *dentry,
if (!(flags & O_CREAT))
return -ENOENT;

- return gfs2_create_inode(dir, dentry, file, S_IFREG | mode, 0, NULL, 0, excl, opened);
+ return gfs2_create_inode(dir, dentry, file, S_IFREG | mode, 0, NULL, 0, excl);
}

/*
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 22176a3818d5..71ae3cc3e53a 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1434,8 +1434,7 @@ static int do_open(struct inode *inode, struct file *filp)

static int nfs_finish_open(struct nfs_open_context *ctx,
struct dentry *dentry,
- struct file *file, unsigned open_flags,
- int *opened)
+ struct file *file, unsigned open_flags)
{
int err;

@@ -1549,7 +1548,7 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
goto out;
}

- err = nfs_finish_open(ctx, ctx->dentry, file, open_flags, opened);
+ err = nfs_finish_open(ctx, ctx->dentry, file, open_flags);
trace_nfs_atomic_open_exit(dir, ctx, open_flags, err);
put_nfs_open_context(ctx);
out:
--
2.11.0


2018-07-11 02:26:18

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 19/42] IMA: don't propagate opened through the entire thing

From: Al Viro <[email protected]>

just check ->f_mode in ima_appraise_measurement()

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 3 +--
fs/nfsd/vfs.c | 2 +-
include/linux/ima.h | 4 ++--
security/integrity/ima/ima.h | 4 ++--
security/integrity/ima/ima_appraise.c | 4 ++--
security/integrity/ima/ima_main.c | 16 ++++++++--------
6 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 33f92918e051..3abb91e23718 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3400,8 +3400,7 @@ static int do_last(struct nameidata *nd,
if (error)
goto out;
opened:
- error = ima_file_check(file, op->acc_mode,
- file->f_mode & FMODE_CREATED ? FILE_CREATED : 0);
+ error = ima_file_check(file, op->acc_mode);
if (!error && will_truncate)
error = handle_truncate(file);
out:
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index b0555d7d8200..55a099e47ba2 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -763,7 +763,7 @@ nfsd_open(struct svc_rqst *rqstp, struct svc_fh *fhp, umode_t type,
goto out_nfserr;
}

- host_err = ima_file_check(file, may_flags, 0);
+ host_err = ima_file_check(file, may_flags);
if (host_err) {
fput(file);
goto out_nfserr;
diff --git a/include/linux/ima.h b/include/linux/ima.h
index 0e4647e0eb60..d9ba3fc363b7 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -16,7 +16,7 @@ struct linux_binprm;

#ifdef CONFIG_IMA
extern int ima_bprm_check(struct linux_binprm *bprm);
-extern int ima_file_check(struct file *file, int mask, int opened);
+extern int ima_file_check(struct file *file, int mask);
extern void ima_file_free(struct file *file);
extern int ima_file_mmap(struct file *file, unsigned long prot);
extern int ima_read_file(struct file *file, enum kernel_read_file_id id);
@@ -34,7 +34,7 @@ static inline int ima_bprm_check(struct linux_binprm *bprm)
return 0;
}

-static inline int ima_file_check(struct file *file, int mask, int opened)
+static inline int ima_file_check(struct file *file, int mask)
{
return 0;
}
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 354bb5716ce3..e4c1a236976c 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -238,7 +238,7 @@ int ima_appraise_measurement(enum ima_hooks func,
struct integrity_iint_cache *iint,
struct file *file, const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len, int opened);
+ int xattr_len);
int ima_must_appraise(struct inode *inode, int mask, enum ima_hooks func);
void ima_update_xattr(struct integrity_iint_cache *iint, struct file *file);
enum integrity_status ima_get_cache_status(struct integrity_iint_cache *iint,
@@ -254,7 +254,7 @@ static inline int ima_appraise_measurement(enum ima_hooks func,
struct file *file,
const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len, int opened)
+ int xattr_len)
{
return INTEGRITY_UNKNOWN;
}
diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index 8bd7a0733e51..deec1804a00a 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -212,7 +212,7 @@ int ima_appraise_measurement(enum ima_hooks func,
struct integrity_iint_cache *iint,
struct file *file, const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len, int opened)
+ int xattr_len)
{
static const char op[] = "appraise_data";
const char *cause = "unknown";
@@ -231,7 +231,7 @@ int ima_appraise_measurement(enum ima_hooks func,
cause = iint->flags & IMA_DIGSIG_REQUIRED ?
"IMA-signature-required" : "missing-hash";
status = INTEGRITY_NOLABEL;
- if (opened & FILE_CREATED)
+ if (file->f_mode & FMODE_CREATED)
iint->flags |= IMA_NEW_FILE;
if ((iint->flags & IMA_NEW_FILE) &&
(!(iint->flags & IMA_DIGSIG_REQUIRED) ||
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index dca44cf7838e..b286f37712d5 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -168,7 +168,7 @@ void ima_file_free(struct file *file)

static int process_measurement(struct file *file, const struct cred *cred,
u32 secid, char *buf, loff_t size, int mask,
- enum ima_hooks func, int opened)
+ enum ima_hooks func)
{
struct inode *inode = file_inode(file);
struct integrity_iint_cache *iint = NULL;
@@ -294,7 +294,7 @@ static int process_measurement(struct file *file, const struct cred *cred,
if (rc == 0 && (action & IMA_APPRAISE_SUBMASK)) {
inode_lock(inode);
rc = ima_appraise_measurement(func, iint, file, pathname,
- xattr_value, xattr_len, opened);
+ xattr_value, xattr_len);
inode_unlock(inode);
}
if (action & IMA_AUDIT)
@@ -338,7 +338,7 @@ int ima_file_mmap(struct file *file, unsigned long prot)
if (file && (prot & PROT_EXEC)) {
security_task_getsecid(current, &secid);
return process_measurement(file, current_cred(), secid, NULL,
- 0, MAY_EXEC, MMAP_CHECK, 0);
+ 0, MAY_EXEC, MMAP_CHECK);
}

return 0;
@@ -364,13 +364,13 @@ int ima_bprm_check(struct linux_binprm *bprm)

security_task_getsecid(current, &secid);
ret = process_measurement(bprm->file, current_cred(), secid, NULL, 0,
- MAY_EXEC, BPRM_CHECK, 0);
+ MAY_EXEC, BPRM_CHECK);
if (ret)
return ret;

security_cred_getsecid(bprm->cred, &secid);
return process_measurement(bprm->file, bprm->cred, secid, NULL, 0,
- MAY_EXEC, CREDS_CHECK, 0);
+ MAY_EXEC, CREDS_CHECK);
}

/**
@@ -383,14 +383,14 @@ int ima_bprm_check(struct linux_binprm *bprm)
* On success return 0. On integrity appraisal error, assuming the file
* is in policy and IMA-appraisal is in enforcing mode, return -EACCES.
*/
-int ima_file_check(struct file *file, int mask, int opened)
+int ima_file_check(struct file *file, int mask)
{
u32 secid;

security_task_getsecid(current, &secid);
return process_measurement(file, current_cred(), secid, NULL, 0,
mask & (MAY_READ | MAY_WRITE | MAY_EXEC |
- MAY_APPEND), FILE_CHECK, opened);
+ MAY_APPEND), FILE_CHECK);
}
EXPORT_SYMBOL_GPL(ima_file_check);

@@ -493,7 +493,7 @@ int ima_post_read_file(struct file *file, void *buf, loff_t size,
func = read_idmap[read_id] ?: FILE_CHECK;
security_task_getsecid(current, &secid);
return process_measurement(file, current_cred(), secid, buf, size,
- MAY_READ, func, 0);
+ MAY_READ, func);
}

static int __init init_ima(void)
--
2.11.0


2018-07-11 02:26:33

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 27/42] kill FILE_{CREATED,OPENED}

From: Al Viro <[email protected]>

no users left

Signed-off-by: Al Viro <[email protected]>
---
include/linux/fs.h | 4 ----
1 file changed, 4 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index c25896b30e9f..bd904c496878 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2436,10 +2436,6 @@ extern struct filename *getname(const char __user *);
extern struct filename *getname_kernel(const char *);
extern void putname(struct filename *name);

-enum {
- FILE_CREATED = 1,
- FILE_OPENED = 2
-};
extern int finish_open(struct file *file, struct dentry *dentry,
int (*open)(struct inode *, struct file *));
extern int finish_no_open(struct file *file, struct dentry *dentry);
--
2.11.0


2018-07-11 02:26:34

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 08/42] make sure do_dentry_open() won't return positive as an error

From: Al Viro <[email protected]>

An ->open() instances really, really should not be doing that. There's
a lot of places e.g. around atomic_open() that could be confused by that,
so let's catch that early.

Signed-off-by: Al Viro <[email protected]>
---
fs/open.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/open.c b/fs/open.c
index 76c56966e297..558802e66e00 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -812,6 +812,10 @@ static int do_dentry_open(struct file *f,
return 0;

cleanup_all:
+ if (unlikely(error > 0)) {
+ WARN_ON(1);
+ error = -EINVAL;
+ }
fops_put(f->f_op);
if (f->f_mode & FMODE_WRITER) {
put_write_access(inode);
--
2.11.0


2018-07-11 02:26:45

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 18/42] introduce FMODE_CREATED and switch to it

From: Al Viro <[email protected]>

Parallel to FILE_CREATED, goes into ->f_mode instead of *opened.
NFS is a bit of a wart here - it doesn't have file at the point
where FILE_CREATED used to be set, so we need to propagate it
there (for now). IMA is another one (here and everywhere)...

Note that this needs do_dentry_open() to leave old bits in ->f_mode
alone - we want it to preserve FMODE_CREATED if it had been already
set (no other bit can be there).

Signed-off-by: Al Viro <[email protected]>
---
fs/9p/vfs_inode.c | 2 +-
fs/9p/vfs_inode_dotl.c | 2 +-
fs/ceph/file.c | 2 +-
fs/cifs/dir.c | 2 +-
fs/fuse/dir.c | 2 +-
fs/gfs2/inode.c | 2 +-
fs/namei.c | 15 ++++++++-------
fs/nfs/dir.c | 5 ++++-
fs/nfs/nfs4proc.c | 2 +-
fs/open.c | 2 +-
include/linux/fs.h | 1 +
11 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 42e102e2e74a..566929792480 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -925,7 +925,7 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
v9fs_cache_inode_set_cookie(d_inode(dentry), file);

- *opened |= FILE_CREATED;
+ file->f_mode |= FMODE_CREATED;
out:
dput(res);
return err;
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 7f6ae21a27b3..ee65db5c7eb0 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -358,7 +358,7 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
file->private_data = ofid;
if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
v9fs_cache_inode_set_cookie(inode, file);
- *opened |= FILE_CREATED;
+ file->f_mode |= FMODE_CREATED;
out:
v9fs_put_acl(dacl, pacl);
dput(res);
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index ad0bed99b1d5..38a63fff7903 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -507,7 +507,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
dout("atomic_open finish_open on dn %p\n", dn);
if (req->r_op == CEPH_MDS_OP_CREATE && req->r_reply_info.has_create_ino) {
ceph_init_inode_acls(d_inode(dentry), &acls);
- *opened |= FILE_CREATED;
+ file->f_mode |= FMODE_CREATED;
}
err = finish_open(file, dentry, ceph_open, opened);
}
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index ddae52bd1993..21d7e393900e 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -539,7 +539,7 @@ cifs_atomic_open(struct inode *inode, struct dentry *direntry,
}

if ((oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))
- *opened |= FILE_CREATED;
+ file->f_mode |= FMODE_CREATED;

rc = finish_open(file, direntry, generic_file_open, opened);
if (rc) {
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 56231b31f806..d4bdcf51e6cb 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -508,7 +508,7 @@ static int fuse_atomic_open(struct inode *dir, struct dentry *entry,
goto no_open;

/* Only creates */
- *opened |= FILE_CREATED;
+ file->f_mode |= FMODE_CREATED;

if (fc->no_create)
goto mknod;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 67c588edf8d8..4aba00a6004b 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -767,7 +767,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
mark_inode_dirty(inode);
d_instantiate(dentry, inode);
if (file) {
- *opened |= FILE_CREATED;
+ file->f_mode |= FMODE_CREATED;
error = finish_open(file, dentry, gfs2_open_common, opened);
}
gfs2_glock_dq_uninit(ghs);
diff --git a/fs/namei.c b/fs/namei.c
index ad8d53c3243c..33f92918e051 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3061,7 +3061,7 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
* permission here.
*/
int acc_mode = op->acc_mode;
- if (*opened & FILE_CREATED) {
+ if (file->f_mode & FMODE_CREATED) {
WARN_ON(!(open_flag & O_CREAT));
fsnotify_create(dir, dentry);
acc_mode = 0;
@@ -3077,7 +3077,7 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
dput(dentry);
dentry = file->f_path.dentry;
}
- if (*opened & FILE_CREATED)
+ if (file->f_mode & FMODE_CREATED)
fsnotify_create(dir, dentry);
if (unlikely(d_is_negative(dentry))) {
error = -ENOENT;
@@ -3126,7 +3126,7 @@ static int lookup_open(struct nameidata *nd, struct path *path,
if (unlikely(IS_DEADDIR(dir_inode)))
return -ENOENT;

- *opened &= ~FILE_CREATED;
+ file->f_mode &= ~FMODE_CREATED;
dentry = d_lookup(dir, &nd->last);
for (;;) {
if (!dentry) {
@@ -3211,7 +3211,7 @@ static int lookup_open(struct nameidata *nd, struct path *path,

/* Negative dentry, just create the file */
if (!dentry->d_inode && (open_flag & O_CREAT)) {
- *opened |= FILE_CREATED;
+ file->f_mode |= FMODE_CREATED;
audit_inode_child(dir_inode, dentry, AUDIT_TYPE_CHILD_CREATE);
if (!dir_inode->i_op->create) {
error = -EACCES;
@@ -3318,7 +3318,7 @@ static int do_last(struct nameidata *nd,
if (error)
goto out;

- if ((*opened & FILE_CREATED) ||
+ if ((file->f_mode & FMODE_CREATED) ||
!S_ISREG(file_inode(file)->i_mode))
will_truncate = false;

@@ -3326,7 +3326,7 @@ static int do_last(struct nameidata *nd,
goto opened;
}

- if (*opened & FILE_CREATED) {
+ if (file->f_mode & FMODE_CREATED) {
/* Don't check for write permission, don't truncate */
open_flag &= ~O_TRUNC;
will_truncate = false;
@@ -3400,7 +3400,8 @@ static int do_last(struct nameidata *nd,
if (error)
goto out;
opened:
- error = ima_file_check(file, op->acc_mode, *opened);
+ error = ima_file_check(file, op->acc_mode,
+ file->f_mode & FMODE_CREATED ? FILE_CREATED : 0);
if (!error && will_truncate)
error = handle_truncate(file);
out:
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 7a9c14426855..0ac50983fc4e 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1461,6 +1461,7 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
struct inode *inode;
unsigned int lookup_flags = 0;
bool switched = false;
+ int created = 0;
int err;

/* Expect a negative dentry */
@@ -1521,7 +1522,9 @@ int nfs_atomic_open(struct inode *dir, struct dentry *dentry,
goto out;

trace_nfs_atomic_open_enter(dir, ctx, open_flags);
- inode = NFS_PROTO(dir)->open_context(dir, ctx, open_flags, &attr, opened);
+ inode = NFS_PROTO(dir)->open_context(dir, ctx, open_flags, &attr, &created);
+ if (created)
+ file->f_mode |= FMODE_CREATED;
if (IS_ERR(inode)) {
err = PTR_ERR(inode);
trace_nfs_atomic_open_exit(dir, ctx, open_flags, err);
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ed45090e4df6..2c4df0ffbca1 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2951,7 +2951,7 @@ static int _nfs4_do_open(struct inode *dir,
}
}
if (opened && opendata->file_created)
- *opened |= FILE_CREATED;
+ *opened = 1;

if (pnfs_use_threshold(ctx_th, opendata->f_attr.mdsthreshold, server)) {
*ctx_th = opendata->f_attr.mdsthreshold;
diff --git a/fs/open.c b/fs/open.c
index 1f379656feab..e8ae46a740b6 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -731,7 +731,7 @@ static int do_dentry_open(struct file *f,
static const struct file_operations empty_fops = {};
int error;

- f->f_mode = OPEN_FMODE(f->f_flags) | FMODE_LSEEK |
+ f->f_mode |= OPEN_FMODE(f->f_flags) | FMODE_LSEEK |
FMODE_PREAD | FMODE_PWRITE;

path_get(&f->f_path);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 05f34726e29c..ca668c7e48a7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -149,6 +149,7 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
#define FMODE_CAN_WRITE ((__force fmode_t)0x40000)

#define FMODE_OPENED ((__force fmode_t)0x80000)
+#define FMODE_CREATED ((__force fmode_t)0x100000)

/* File was opened by fanotify and shouldn't generate fanotify events */
#define FMODE_NONOTIFY ((__force fmode_t)0x4000000)
--
2.11.0


2018-07-11 02:26:49

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 17/42] switch all remaining checks for FILE_OPENED to FMODE_OPENED

From: Al Viro <[email protected]>

... and don't bother with setting FILE_OPENED at all.

Signed-off-by: Al Viro <[email protected]>
---
fs/gfs2/inode.c | 2 +-
fs/namei.c | 7 ++-----
fs/open.c | 9 ++-------
3 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index feda55f67050..67c588edf8d8 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -1250,7 +1250,7 @@ static int gfs2_atomic_open(struct inode *dir, struct dentry *dentry,
if (d != NULL)
dentry = d;
if (d_really_is_positive(dentry)) {
- if (!(*opened & FILE_OPENED))
+ if (!(file->f_mode & FMODE_OPENED))
return finish_no_open(file, d);
dput(d);
return 0;
diff --git a/fs/namei.c b/fs/namei.c
index 39b5da70d6f5..ad8d53c3243c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3395,11 +3395,10 @@ static int do_last(struct nameidata *nd,
error = may_open(&nd->path, acc_mode, open_flag);
if (error)
goto out;
- BUG_ON(*opened & FILE_OPENED); /* once it's opened, it's opened */
+ BUG_ON(file->f_mode & FMODE_OPENED); /* once it's opened, it's opened */
error = vfs_open(&nd->path, file);
if (error)
goto out;
- *opened |= FILE_OPENED;
opened:
error = ima_file_check(file, op->acc_mode, *opened);
if (!error && will_truncate)
@@ -3517,8 +3516,6 @@ static struct file *path_openat(struct nameidata *nd,

if (unlikely(file->f_flags & O_PATH)) {
error = do_o_path(nd, flags, file);
- if (!error)
- opened |= FILE_OPENED;
goto out2;
}

@@ -3539,7 +3536,7 @@ static struct file *path_openat(struct nameidata *nd,
terminate_walk(nd);
out2:
if (likely(!error)) {
- if (likely(opened & FILE_OPENED))
+ if (likely(file->f_mode & FMODE_OPENED))
return file;
WARN_ON(1);
error = -EINVAL;
diff --git a/fs/open.c b/fs/open.c
index 2aa3dc03e2fd..1f379656feab 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -849,15 +849,10 @@ int finish_open(struct file *file, struct dentry *dentry,
int (*open)(struct inode *, struct file *),
int *opened)
{
- int error;
- BUG_ON(*opened & FILE_OPENED); /* once it's opened, it's opened */
+ BUG_ON(file->f_mode & FMODE_OPENED); /* once it's opened, it's opened */

file->f_path.dentry = dentry;
- error = do_dentry_open(file, d_backing_inode(dentry), open);
- if (!error)
- *opened |= FILE_OPENED;
-
- return error;
+ return do_dentry_open(file, d_backing_inode(dentry), open);
}
EXPORT_SYMBOL(finish_open);

--
2.11.0


2018-07-11 02:27:01

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 03/42] ocxlflash_getfile(): fix double-iput() on alloc_file() failures

From: Al Viro <[email protected]>

Cc: [email protected]
Signed-off-by: Al Viro <[email protected]>
---
drivers/scsi/cxlflash/ocxl_hw.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 0a95b5f25380..497a68389461 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -134,15 +134,14 @@ static struct file *ocxlflash_getfile(struct device *dev, const char *name,
rc = PTR_ERR(file);
dev_err(dev, "%s: alloc_file failed rc=%d\n",
__func__, rc);
- goto err5;
+ path_put(&path);
+ goto err3;
}

file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
file->private_data = priv;
out:
return file;
-err5:
- path_put(&path);
err4:
iput(inode);
err3:
--
2.11.0


2018-07-11 02:27:06

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 20/42] getting rid of 'opened' argument of ->atomic_open() - step 1

From: Al Viro <[email protected]>

'opened' argument of finish_open() is unused. Kill it.

Signed-off-by Al Viro <[email protected]>
---
fs/9p/vfs_inode.c | 2 +-
fs/9p/vfs_inode_dotl.c | 2 +-
fs/ceph/file.c | 2 +-
fs/cifs/dir.c | 2 +-
fs/fuse/dir.c | 2 +-
fs/gfs2/inode.c | 6 +++---
fs/namei.c | 2 +-
fs/nfs/dir.c | 2 +-
fs/open.c | 3 +--
include/linux/fs.h | 3 +--
10 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 566929792480..7b6ff3275d9c 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -917,7 +917,7 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
v9inode->writeback_fid = (void *) inode_fid;
}
mutex_unlock(&v9inode->v_mutex);
- err = finish_open(file, dentry, generic_file_open, opened);
+ err = finish_open(file, dentry, generic_file_open);
if (err)
goto error;

diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index ee65db5c7eb0..c6939b7cb18c 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -352,7 +352,7 @@ v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
}
mutex_unlock(&v9inode->v_mutex);
/* Since we are opening a file, assign the open fid to the file */
- err = finish_open(file, dentry, generic_file_open, opened);
+ err = finish_open(file, dentry, generic_file_open);
if (err)
goto err_clunk_old_fid;
file->private_data = ofid;
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 38a63fff7903..38b28cb2fac1 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -509,7 +509,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
ceph_init_inode_acls(d_inode(dentry), &acls);
file->f_mode |= FMODE_CREATED;
}
- err = finish_open(file, dentry, ceph_open, opened);
+ err = finish_open(file, dentry, ceph_open);
}
out_req:
if (!req->r_err && req->r_target_inode)
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 21d7e393900e..891bfd62e67a 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -541,7 +541,7 @@ cifs_atomic_open(struct inode *inode, struct dentry *direntry,
if ((oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))
file->f_mode |= FMODE_CREATED;

- rc = finish_open(file, direntry, generic_file_open, opened);
+ rc = finish_open(file, direntry, generic_file_open);
if (rc) {
if (server->ops->close)
server->ops->close(xid, tcon, &fid);
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d4bdcf51e6cb..a5b1f5ff8cb7 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -469,7 +469,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
d_instantiate(entry, inode);
fuse_change_entry_timeout(entry, &outentry);
fuse_invalidate_attr(dir);
- err = finish_open(file, entry, generic_file_open, opened);
+ err = finish_open(file, entry, generic_file_open);
if (err) {
fuse_sync_release(ff, flags);
} else {
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 4aba00a6004b..59f695e96d63 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -626,7 +626,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
error = 0;
if (file) {
if (S_ISREG(inode->i_mode))
- error = finish_open(file, dentry, gfs2_open_common, opened);
+ error = finish_open(file, dentry, gfs2_open_common);
else
error = finish_no_open(file, NULL);
}
@@ -768,7 +768,7 @@ static int gfs2_create_inode(struct inode *dir, struct dentry *dentry,
d_instantiate(dentry, inode);
if (file) {
file->f_mode |= FMODE_CREATED;
- error = finish_open(file, dentry, gfs2_open_common, opened);
+ error = finish_open(file, dentry, gfs2_open_common);
}
gfs2_glock_dq_uninit(ghs);
gfs2_glock_dq_uninit(ghs + 1);
@@ -866,7 +866,7 @@ static struct dentry *__gfs2_lookup(struct inode *dir, struct dentry *dentry,
return d;
}
if (file && S_ISREG(inode->i_mode))
- error = finish_open(file, dentry, gfs2_open_common, opened);
+ error = finish_open(file, dentry, gfs2_open_common);

gfs2_glock_dq_uninit(&gh);
if (error) {
diff --git a/fs/namei.c b/fs/namei.c
index 3abb91e23718..cc412d0620a3 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3475,7 +3475,7 @@ static int do_tmpfile(struct nameidata *nd, unsigned flags,
if (error)
goto out2;
file->f_path.mnt = path.mnt;
- error = finish_open(file, child, NULL, opened);
+ error = finish_open(file, child, NULL);
out2:
mnt_drop_write(path.mnt);
out:
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 0ac50983fc4e..22176a3818d5 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1439,7 +1439,7 @@ static int nfs_finish_open(struct nfs_open_context *ctx,
{
int err;

- err = finish_open(file, dentry, do_open, opened);
+ err = finish_open(file, dentry, do_open);
if (err)
goto out;
if (S_ISREG(file->f_path.dentry->d_inode->i_mode))
diff --git a/fs/open.c b/fs/open.c
index e8ae46a740b6..a46ba6647562 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -846,8 +846,7 @@ static int do_dentry_open(struct file *f,
* Returns zero on success or -errno if the open failed.
*/
int finish_open(struct file *file, struct dentry *dentry,
- int (*open)(struct inode *, struct file *),
- int *opened)
+ int (*open)(struct inode *, struct file *))
{
BUG_ON(file->f_mode & FMODE_OPENED); /* once it's opened, it's opened */

diff --git a/include/linux/fs.h b/include/linux/fs.h
index ca668c7e48a7..70be3e4c26ac 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2441,8 +2441,7 @@ enum {
FILE_OPENED = 2
};
extern int finish_open(struct file *file, struct dentry *dentry,
- int (*open)(struct inode *, struct file *),
- int *opened);
+ int (*open)(struct inode *, struct file *));
extern int finish_no_open(struct file *file, struct dentry *dentry);

/* fs/ioctl.c */
--
2.11.0


2018-07-11 02:27:09

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 09/42] pass creds to get_empty_filp(), make sure dentry_open() passes the right creds

From: Al Viro <[email protected]>

... and rename get_empty_filp() to alloc_empty_file().

dentry_open() gets creds as argument, but the only thing that sees those is
security_file_open() - file->f_cred still ends up with current_cred(). For
almost all callers it's the same thing, but there are several broken cases.

Signed-off-by: Al Viro <[email protected]>
---
fs/file_table.c | 5 ++---
fs/internal.h | 2 +-
fs/namei.c | 2 +-
fs/open.c | 2 +-
4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index eee7cf629e52..d7a03a47b702 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -101,9 +101,8 @@ int proc_nr_files(struct ctl_table *table, int write,
* done, you will imbalance int the mount's writer count
* and a warning at __fput() time.
*/
-struct file *get_empty_filp(void)
+struct file *alloc_empty_file(const struct cred *cred)
{
- const struct cred *cred = current_cred();
static long old_max;
struct file *f;
int error;
@@ -161,7 +160,7 @@ struct file *alloc_file(const struct path *path, fmode_t mode,
{
struct file *file;

- file = get_empty_filp();
+ file = alloc_empty_file(current_cred());
if (IS_ERR(file))
return file;

diff --git a/fs/internal.h b/fs/internal.h
index 5645b4ebf494..66473bf388e4 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -93,7 +93,7 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
/*
* file_table.c
*/
-extern struct file *get_empty_filp(void);
+extern struct file *alloc_empty_file(const struct cred *);

/*
* super.c
diff --git a/fs/namei.c b/fs/namei.c
index 734cef54fdf8..af2ec1803f57 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3513,7 +3513,7 @@ static struct file *path_openat(struct nameidata *nd,
int opened = 0;
int error;

- file = get_empty_filp();
+ file = alloc_empty_file(current_cred());
if (IS_ERR(file))
return file;

diff --git a/fs/open.c b/fs/open.c
index 558802e66e00..c0dbc67c31f1 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -923,7 +923,7 @@ struct file *dentry_open(const struct path *path, int flags,
/* We must always pass in a valid mount pointer. */
BUG_ON(!path->mnt);

- f = get_empty_filp();
+ f = alloc_empty_file(cred);
if (!IS_ERR(f)) {
f->f_flags = flags;
error = vfs_open(path, f, cred);
--
2.11.0


2018-07-11 02:27:20

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 12/42] ->file_open(): lose cred argument

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
include/linux/lsm_hooks.h | 2 +-
security/apparmor/lsm.c | 4 ++--
security/security.c | 2 +-
security/selinux/hooks.c | 4 ++--
security/smack/smack_lsm.c | 6 +++---
security/tomoyo/tomoyo.c | 2 +-
6 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 8f1131c8dd54..a8ee106b865d 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -1569,7 +1569,7 @@ union security_list_options {
int (*file_send_sigiotask)(struct task_struct *tsk,
struct fown_struct *fown, int sig);
int (*file_receive)(struct file *file);
- int (*file_open)(struct file *file, const struct cred *cred);
+ int (*file_open)(struct file *file);

int (*task_alloc)(struct task_struct *task, unsigned long clone_flags);
void (*task_free)(struct task_struct *task);
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 74f17376202b..8b8b70620bbe 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -395,7 +395,7 @@ static int apparmor_inode_getattr(const struct path *path)
return common_perm_cond(OP_GETATTR, path, AA_MAY_GETATTR);
}

-static int apparmor_file_open(struct file *file, const struct cred *cred)
+static int apparmor_file_open(struct file *file)
{
struct aa_file_ctx *fctx = file_ctx(file);
struct aa_label *label;
@@ -414,7 +414,7 @@ static int apparmor_file_open(struct file *file, const struct cred *cred)
return 0;
}

- label = aa_get_newest_cred_label(cred);
+ label = aa_get_newest_cred_label(file->f_cred);
if (!unconfined(label)) {
struct inode *inode = file_inode(file);
struct path_cond cond = { inode->i_uid, inode->i_mode };
diff --git a/security/security.c b/security/security.c
index 235b35f58a65..5dce67070cdf 100644
--- a/security/security.c
+++ b/security/security.c
@@ -974,7 +974,7 @@ int security_file_open(struct file *file)
{
int ret;

- ret = call_int_hook(file_open, 0, file, file->f_cred);
+ ret = call_int_hook(file_open, 0, file);
if (ret)
return ret;

diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 2b5ee5fbd652..18006be15713 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3862,7 +3862,7 @@ static int selinux_file_receive(struct file *file)
return file_has_perm(cred, file, file_to_av(file));
}

-static int selinux_file_open(struct file *file, const struct cred *cred)
+static int selinux_file_open(struct file *file)
{
struct file_security_struct *fsec;
struct inode_security_struct *isec;
@@ -3886,7 +3886,7 @@ static int selinux_file_open(struct file *file, const struct cred *cred)
* new inode label or new policy.
* This check is not redundant - do not remove.
*/
- return file_path_has_perm(cred, file, open_file_to_av(file));
+ return file_path_has_perm(file->f_cred, file, open_file_to_av(file));
}

/* task security operations */
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 7ad226018f51..e7b6c012431d 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -1927,9 +1927,9 @@ static int smack_file_receive(struct file *file)
*
* Returns 0
*/
-static int smack_file_open(struct file *file, const struct cred *cred)
+static int smack_file_open(struct file *file)
{
- struct task_smack *tsp = cred->security;
+ struct task_smack *tsp = file->f_cred->security;
struct inode *inode = file_inode(file);
struct smk_audit_info ad;
int rc;
@@ -1937,7 +1937,7 @@ static int smack_file_open(struct file *file, const struct cred *cred)
smk_ad_init(&ad, __func__, LSM_AUDIT_DATA_PATH);
smk_ad_setfield_u_fs_path(&ad, file->f_path);
rc = smk_tskacc(tsp, smk_of_inode(inode), MAY_READ, &ad);
- rc = smk_bu_credfile(cred, file, MAY_READ, rc);
+ rc = smk_bu_credfile(file->f_cred, file, MAY_READ, rc);

return rc;
}
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index 213b8c593668..9f932e2d6852 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -320,7 +320,7 @@ static int tomoyo_file_fcntl(struct file *file, unsigned int cmd,
*
* Returns 0 on success, negative value otherwise.
*/
-static int tomoyo_file_open(struct file *f, const struct cred *cred)
+static int tomoyo_file_open(struct file *f)
{
int flags = f->f_flags;
/* Don't check read permission here if called from do_execve(). */
--
2.11.0


2018-07-11 02:27:22

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 10/42] get rid of cred argument of vfs_open() and do_dentry_open()

From: Al Viro <[email protected]>

always equal to ->f_cred

Signed-off-by: Al Viro <[email protected]>
---
fs/internal.h | 2 +-
fs/namei.c | 4 ++--
fs/open.c | 15 ++++++---------
3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 66473bf388e4..ab84a29f4874 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -126,7 +126,7 @@ int do_fchownat(int dfd, const char __user *filename, uid_t user, gid_t group,
int flag);

extern int open_check_o_direct(struct file *f);
-extern int vfs_open(const struct path *, struct file *, const struct cred *);
+extern int vfs_open(const struct path *, struct file *);

/*
* inode.c
diff --git a/fs/namei.c b/fs/namei.c
index af2ec1803f57..7b1fe5e30a0d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3396,7 +3396,7 @@ static int do_last(struct nameidata *nd,
if (error)
goto out;
BUG_ON(*opened & FILE_OPENED); /* once it's opened, it's opened */
- error = vfs_open(&nd->path, file, current_cred());
+ error = vfs_open(&nd->path, file);
if (error)
goto out;
*opened |= FILE_OPENED;
@@ -3499,7 +3499,7 @@ static int do_o_path(struct nameidata *nd, unsigned flags, struct file *file)
int error = path_lookupat(nd, flags, &path);
if (!error) {
audit_inode(nd->name, path.dentry, 0);
- error = vfs_open(&path, file, current_cred());
+ error = vfs_open(&path, file);
path_put(&path);
}
return error;
diff --git a/fs/open.c b/fs/open.c
index c0dbc67c31f1..c8fd5126c50e 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -736,8 +736,7 @@ int open_check_o_direct(struct file *f)

static int do_dentry_open(struct file *f,
struct inode *inode,
- int (*open)(struct inode *, struct file *),
- const struct cred *cred)
+ int (*open)(struct inode *, struct file *))
{
static const struct file_operations empty_fops = {};
int error;
@@ -780,7 +779,7 @@ static int do_dentry_open(struct file *f,
goto cleanup_all;
}

- error = security_file_open(f, cred);
+ error = security_file_open(f, f->f_cred);
if (error)
goto cleanup_all;

@@ -858,8 +857,7 @@ int finish_open(struct file *file, struct dentry *dentry,
BUG_ON(*opened & FILE_OPENED); /* once it's opened, it's opened */

file->f_path.dentry = dentry;
- error = do_dentry_open(file, d_backing_inode(dentry), open,
- current_cred());
+ error = do_dentry_open(file, d_backing_inode(dentry), open);
if (!error)
*opened |= FILE_OPENED;

@@ -900,8 +898,7 @@ EXPORT_SYMBOL(file_path);
* @file: newly allocated file with f_flag initialized
* @cred: credentials to use
*/
-int vfs_open(const struct path *path, struct file *file,
- const struct cred *cred)
+int vfs_open(const struct path *path, struct file *file)
{
struct dentry *dentry = d_real(path->dentry, NULL, file->f_flags, 0);

@@ -909,7 +906,7 @@ int vfs_open(const struct path *path, struct file *file,
return PTR_ERR(dentry);

file->f_path = *path;
- return do_dentry_open(file, d_backing_inode(dentry), NULL, cred);
+ return do_dentry_open(file, d_backing_inode(dentry), NULL);
}

struct file *dentry_open(const struct path *path, int flags,
@@ -926,7 +923,7 @@ struct file *dentry_open(const struct path *path, int flags,
f = alloc_empty_file(cred);
if (!IS_ERR(f)) {
f->f_flags = flags;
- error = vfs_open(path, f, cred);
+ error = vfs_open(path, f);
if (!error) {
/* from now on we need fput() to dispose of f */
error = open_check_o_direct(f);
--
2.11.0


2018-07-11 02:27:31

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 14/42] fold put_filp() into fput()

From: Al Viro <[email protected]>

Just check FMODE_OPENED in __fput() and be done with that...

Signed-off-by: Al Viro <[email protected]>
---
fs/file_table.c | 15 +++++----------
fs/namei.c | 4 ++--
fs/open.c | 11 +++--------
include/linux/file.h | 1 -
4 files changed, 10 insertions(+), 21 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index eaee481295de..d55dc579ae4e 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -190,6 +190,9 @@ static void __fput(struct file *file)
struct vfsmount *mnt = file->f_path.mnt;
struct inode *inode = file->f_inode;

+ if (unlikely(!(file->f_mode & FMODE_OPENED)))
+ goto out;
+
might_sleep();

fsnotify_close(file);
@@ -219,12 +222,10 @@ static void __fput(struct file *file)
put_write_access(inode);
__mnt_drop_write(mnt);
}
- file->f_path.dentry = NULL;
- file->f_path.mnt = NULL;
- file->f_inode = NULL;
- file_free(file);
dput(dentry);
mntput(mnt);
+out:
+ file_free(file);
}

static LLIST_HEAD(delayed_fput_list);
@@ -299,12 +300,6 @@ void __fput_sync(struct file *file)

EXPORT_SYMBOL(fput);

-void put_filp(struct file *file)
-{
- if (atomic_long_dec_and_test(&file->f_count))
- file_free(file);
-}
-
void __init files_init(void)
{
filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
diff --git a/fs/namei.c b/fs/namei.c
index 7b1fe5e30a0d..3d4e20a1c756 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3533,7 +3533,7 @@ static struct file *path_openat(struct nameidata *nd,

s = path_init(nd, flags);
if (IS_ERR(s)) {
- put_filp(file);
+ fput(file);
return ERR_CAST(s);
}
while (!(error = link_path_walk(s, nd)) &&
@@ -3549,7 +3549,7 @@ static struct file *path_openat(struct nameidata *nd,
out2:
if (!(opened & FILE_OPENED)) {
BUG_ON(!error);
- put_filp(file);
+ fput(file);
}
if (unlikely(error)) {
if (error == -EOPENSTALE) {
diff --git a/fs/open.c b/fs/open.c
index ce092dda5472..affdeebe5fd5 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -925,15 +925,10 @@ struct file *dentry_open(const struct path *path, int flags,
if (!IS_ERR(f)) {
f->f_flags = flags;
error = vfs_open(path, f);
- if (!error) {
- /* from now on we need fput() to dispose of f */
+ if (!error)
error = open_check_o_direct(f);
- if (error) {
- fput(f);
- f = ERR_PTR(error);
- }
- } else {
- put_filp(f);
+ if (error) {
+ fput(f);
f = ERR_PTR(error);
}
}
diff --git a/include/linux/file.h b/include/linux/file.h
index 279720db984a..d771bc05da77 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -78,7 +78,6 @@ extern int f_dupfd(unsigned int from, struct file *file, unsigned flags);
extern int replace_fd(unsigned fd, struct file *file, unsigned flags);
extern void set_close_on_exec(unsigned int fd, int flag);
extern bool get_close_on_exec(unsigned int fd);
-extern void put_filp(struct file *);
extern int get_unused_fd_flags(unsigned flags);
extern void put_unused_fd(unsigned int fd);

--
2.11.0


2018-07-11 02:27:36

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 06/42] turn filp_clone_open() into inline wrapper for dentry_open()

From: Al Viro <[email protected]>

it's exactly the same thing as
dentry_open(&file->f_path, file->f_flags, file->f_cred)

... and rename it to file_clone_open(), while we are at it.
'filp' naming convention is bogus; sure, it's "file pointer",
but we generally don't do that kind of Hungarian notation.
Some of the instances have too many callers to touch, but this
one has only two, so let's sanitize it while we can...

Signed-off-by: Al Viro <[email protected]>
---
drivers/gpu/drm/drm_lease.c | 2 +-
fs/binfmt_misc.c | 2 +-
fs/open.c | 20 --------------------
include/linux/fs.h | 5 ++++-
4 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c
index d638c0fb3418..b54fb78a283c 100644
--- a/drivers/gpu/drm/drm_lease.c
+++ b/drivers/gpu/drm/drm_lease.c
@@ -553,7 +553,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,

/* Clone the lessor file to create a new file for us */
DRM_DEBUG_LEASE("Allocating lease file\n");
- lessee_file = filp_clone_open(lessor_file);
+ lessee_file = file_clone_open(lessor_file);
if (IS_ERR(lessee_file)) {
ret = PTR_ERR(lessee_file);
goto out_lessee;
diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 4b5fff31ef27..aa4a7a23ff99 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -205,7 +205,7 @@ static int load_misc_binary(struct linux_binprm *bprm)
goto error;

if (fmt->flags & MISC_FMT_OPEN_FILE) {
- interp_file = filp_clone_open(fmt->interp_file);
+ interp_file = file_clone_open(fmt->interp_file);
if (!IS_ERR(interp_file))
deny_write_access(interp_file);
} else {
diff --git a/fs/open.c b/fs/open.c
index d0e955b558ad..76c56966e297 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1063,26 +1063,6 @@ struct file *file_open_root(struct dentry *dentry, struct vfsmount *mnt,
}
EXPORT_SYMBOL(file_open_root);

-struct file *filp_clone_open(struct file *oldfile)
-{
- struct file *file;
- int retval;
-
- file = get_empty_filp();
- if (IS_ERR(file))
- return file;
-
- file->f_flags = oldfile->f_flags;
- retval = vfs_open(&oldfile->f_path, file, oldfile->f_cred);
- if (retval) {
- put_filp(file);
- return ERR_PTR(retval);
- }
-
- return file;
-}
-EXPORT_SYMBOL(filp_clone_open);
-
long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
{
struct open_flags op;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index aa9b4c169ed2..c4ca4c9c1130 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2422,7 +2422,10 @@ extern struct file *filp_open(const char *, int, umode_t);
extern struct file *file_open_root(struct dentry *, struct vfsmount *,
const char *, int, umode_t);
extern struct file * dentry_open(const struct path *, int, const struct cred *);
-extern struct file *filp_clone_open(struct file *);
+static inline struct file *file_clone_open(struct file *file)
+{
+ return dentry_open(&file->f_path, file->f_flags, file->f_cred);
+}
extern int filp_close(struct file *, fl_owner_t id);

extern struct filename *getname_flags(const char __user *, int, int *);
--
2.11.0


2018-07-11 02:27:44

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 01/42] drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()

From: Al Viro <[email protected]>

Failure of ->open() should *not* be followed by fput(). Fixed by
using filp_clone_open(), which gets the cleanups right.

Signed-off-by: Al Viro <[email protected]>
---
drivers/gpu/drm/drm_lease.c | 16 +---------------
fs/internal.h | 1 -
include/linux/fs.h | 1 +
3 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c
index 50c73c0a20b9..d638c0fb3418 100644
--- a/drivers/gpu/drm/drm_lease.c
+++ b/drivers/gpu/drm/drm_lease.c
@@ -553,24 +553,13 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,

/* Clone the lessor file to create a new file for us */
DRM_DEBUG_LEASE("Allocating lease file\n");
- path_get(&lessor_file->f_path);
- lessee_file = alloc_file(&lessor_file->f_path,
- lessor_file->f_mode,
- fops_get(lessor_file->f_inode->i_fop));
-
+ lessee_file = filp_clone_open(lessor_file);
if (IS_ERR(lessee_file)) {
ret = PTR_ERR(lessee_file);
goto out_lessee;
}

- /* Initialize the new file for DRM */
- DRM_DEBUG_LEASE("Initializing the file with %p\n", lessee_file->f_op->open);
- ret = lessee_file->f_op->open(lessee_file->f_inode, lessee_file);
- if (ret)
- goto out_lessee_file;
-
lessee_priv = lessee_file->private_data;
-
/* Change the file to a master one */
drm_master_put(&lessee_priv->master);
lessee_priv->master = lessee;
@@ -588,9 +577,6 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
DRM_DEBUG_LEASE("drm_mode_create_lease_ioctl succeeded\n");
return 0;

-out_lessee_file:
- fput(lessee_file);
-
out_lessee:
drm_master_put(&lessee);

diff --git a/fs/internal.h b/fs/internal.h
index 980d005b21b4..5645b4ebf494 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -127,7 +127,6 @@ int do_fchownat(int dfd, const char __user *filename, uid_t user, gid_t group,

extern int open_check_o_direct(struct file *f);
extern int vfs_open(const struct path *, struct file *, const struct cred *);
-extern struct file *filp_clone_open(struct file *);

/*
* inode.c
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5c91108846db..aa9b4c169ed2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2422,6 +2422,7 @@ extern struct file *filp_open(const char *, int, umode_t);
extern struct file *file_open_root(struct dentry *, struct vfsmount *,
const char *, int, umode_t);
extern struct file * dentry_open(const struct path *, int, const struct cred *);
+extern struct file *filp_clone_open(struct file *);
extern int filp_close(struct file *, fl_owner_t id);

extern struct filename *getname_flags(const char __user *, int, int *);
--
2.11.0


2018-07-11 02:27:47

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 15/42] lift fput() on late failures into path_openat()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/namei.c | 28 ++++++++++++----------------
1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 3d4e20a1c756..5a79bd170410 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3407,8 +3407,6 @@ static int do_last(struct nameidata *nd,
if (!error && will_truncate)
error = handle_truncate(file);
out:
- if (unlikely(error) && (*opened & FILE_OPENED))
- fput(file);
if (unlikely(error > 0)) {
WARN_ON(1);
error = -EINVAL;
@@ -3484,8 +3482,6 @@ static int do_tmpfile(struct nameidata *nd, unsigned flags,
if (error)
goto out2;
error = open_check_o_direct(file);
- if (error)
- fput(file);
out2:
mnt_drop_write(path.mnt);
out:
@@ -3547,20 +3543,20 @@ static struct file *path_openat(struct nameidata *nd,
}
terminate_walk(nd);
out2:
- if (!(opened & FILE_OPENED)) {
- BUG_ON(!error);
- fput(file);
+ if (likely(!error)) {
+ if (likely(opened & FILE_OPENED))
+ return file;
+ WARN_ON(1);
+ error = -EINVAL;
}
- if (unlikely(error)) {
- if (error == -EOPENSTALE) {
- if (flags & LOOKUP_RCU)
- error = -ECHILD;
- else
- error = -ESTALE;
- }
- file = ERR_PTR(error);
+ fput(file);
+ if (error == -EOPENSTALE) {
+ if (flags & LOOKUP_RCU)
+ error = -ECHILD;
+ else
+ error = -ESTALE;
}
- return file;
+ return ERR_PTR(error);
}

struct file *do_filp_open(int dfd, struct filename *pathname,
--
2.11.0


2018-07-11 02:28:00

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 13/42] introduce FMODE_OPENED

From: Al Viro <[email protected]>

basically, "is that instance set up enough for regular fput(), or
do we want put_filp() for that one".

NOTE: the only alloc_file() caller that could be followed by put_filp()
is in arch/ia64/kernel/perfmon.c, which is (Kconfig-level) broken.

Signed-off-by: Al Viro <[email protected]>
---
fs/file_table.c | 2 +-
fs/open.c | 3 ++-
include/linux/fs.h | 2 ++
3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index d7a03a47b702..eaee481295de 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -174,7 +174,7 @@ struct file *alloc_file(const struct path *path, fmode_t mode,
if ((mode & FMODE_WRITE) &&
likely(fop->write || fop->write_iter))
mode |= FMODE_CAN_WRITE;
- file->f_mode = mode;
+ file->f_mode = mode | FMODE_OPENED;
file->f_op = fop;
if ((mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
i_readcount_inc(path->dentry->d_inode);
diff --git a/fs/open.c b/fs/open.c
index 008a65e82de5..ce092dda5472 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -752,7 +752,7 @@ static int do_dentry_open(struct file *f,
f->f_wb_err = filemap_sample_wb_err(f->f_mapping);

if (unlikely(f->f_flags & O_PATH)) {
- f->f_mode = FMODE_PATH;
+ f->f_mode = FMODE_PATH | FMODE_OPENED;
f->f_op = &empty_fops;
return 0;
}
@@ -794,6 +794,7 @@ static int do_dentry_open(struct file *f,
if (error)
goto cleanup_all;
}
+ f->f_mode |= FMODE_OPENED;
if ((f->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ)
i_readcount_inc(inode);
if ((f->f_mode & FMODE_READ) &&
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c4ca4c9c1130..05f34726e29c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -148,6 +148,8 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
/* Has write method(s) */
#define FMODE_CAN_WRITE ((__force fmode_t)0x40000)

+#define FMODE_OPENED ((__force fmode_t)0x80000)
+
/* File was opened by fanotify and shouldn't generate fanotify events */
#define FMODE_NONOTIFY ((__force fmode_t)0x4000000)

--
2.11.0


2018-07-11 02:28:15

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 07/42] create_pipe_files(): use fput() if allocation of the second file fails

From: Al Viro <[email protected]>

... just use put_pipe_info() to get the pipe->files down to 1 and let
fput()-called pipe_release() do freeing.

Signed-off-by: Al Viro <[email protected]>
---
fs/pipe.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index bb0840e234f3..9405e455f5b1 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -771,8 +771,9 @@ int create_pipe_files(struct file **res, int flags)

res[0] = alloc_file(&path, FMODE_READ, &pipefifo_fops);
if (IS_ERR(res[0])) {
- err = PTR_ERR(res[0]);
- goto err_file;
+ put_pipe_info(inode, inode->i_pipe);
+ fput(f);
+ return PTR_ERR(res[0]);
}

path_get(&path);
@@ -781,8 +782,6 @@ int create_pipe_files(struct file **res, int flags)
res[1] = f;
return 0;

-err_file:
- put_filp(f);
err_dentry:
free_pipe_info(inode->i_pipe);
path_put(&path);
--
2.11.0


2018-07-11 02:28:14

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 11/42] security_file_open(): lose cred argument

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/open.c | 2 +-
include/linux/security.h | 5 ++---
security/security.c | 4 ++--
3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index c8fd5126c50e..008a65e82de5 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -779,7 +779,7 @@ static int do_dentry_open(struct file *f,
goto cleanup_all;
}

- error = security_file_open(f, f->f_cred);
+ error = security_file_open(f);
if (error)
goto cleanup_all;

diff --git a/include/linux/security.h b/include/linux/security.h
index 63030c85ee19..88d30fc975e7 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -309,7 +309,7 @@ void security_file_set_fowner(struct file *file);
int security_file_send_sigiotask(struct task_struct *tsk,
struct fown_struct *fown, int sig);
int security_file_receive(struct file *file);
-int security_file_open(struct file *file, const struct cred *cred);
+int security_file_open(struct file *file);
int security_task_alloc(struct task_struct *task, unsigned long clone_flags);
void security_task_free(struct task_struct *task);
int security_cred_alloc_blank(struct cred *cred, gfp_t gfp);
@@ -858,8 +858,7 @@ static inline int security_file_receive(struct file *file)
return 0;
}

-static inline int security_file_open(struct file *file,
- const struct cred *cred)
+static inline int security_file_open(struct file *file)
{
return 0;
}
diff --git a/security/security.c b/security/security.c
index 68f46d849abe..235b35f58a65 100644
--- a/security/security.c
+++ b/security/security.c
@@ -970,11 +970,11 @@ int security_file_receive(struct file *file)
return call_int_hook(file_receive, 0, file);
}

-int security_file_open(struct file *file, const struct cred *cred)
+int security_file_open(struct file *file)
{
int ret;

- ret = call_int_hook(file_open, 0, file, cred);
+ ret = call_int_hook(file_open, 0, file, file->f_cred);
if (ret)
return ret;

--
2.11.0


2018-07-11 02:28:27

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 16/42] now we can fold open_check_o_direct() into do_dentry_open()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/internal.h | 1 -
fs/namei.c | 7 +------
fs/open.c | 17 +++++------------
3 files changed, 6 insertions(+), 19 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index ab84a29f4874..33a28438570e 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -125,7 +125,6 @@ int do_fchmodat(int dfd, const char __user *filename, umode_t mode);
int do_fchownat(int dfd, const char __user *filename, uid_t user, gid_t group,
int flag);

-extern int open_check_o_direct(struct file *f);
extern int vfs_open(const struct path *, struct file *);

/*
diff --git a/fs/namei.c b/fs/namei.c
index 5a79bd170410..39b5da70d6f5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3401,9 +3401,7 @@ static int do_last(struct nameidata *nd,
goto out;
*opened |= FILE_OPENED;
opened:
- error = open_check_o_direct(file);
- if (!error)
- error = ima_file_check(file, op->acc_mode, *opened);
+ error = ima_file_check(file, op->acc_mode, *opened);
if (!error && will_truncate)
error = handle_truncate(file);
out:
@@ -3479,9 +3477,6 @@ static int do_tmpfile(struct nameidata *nd, unsigned flags,
goto out2;
file->f_path.mnt = path.mnt;
error = finish_open(file, child, NULL, opened);
- if (error)
- goto out2;
- error = open_check_o_direct(file);
out2:
mnt_drop_write(path.mnt);
out:
diff --git a/fs/open.c b/fs/open.c
index affdeebe5fd5..2aa3dc03e2fd 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -724,16 +724,6 @@ SYSCALL_DEFINE3(fchown, unsigned int, fd, uid_t, user, gid_t, group)
return ksys_fchown(fd, user, group);
}

-int open_check_o_direct(struct file *f)
-{
- /* NB: we're sure to have correct a_ops only after f_op->open */
- if (f->f_flags & O_DIRECT) {
- if (!f->f_mapping->a_ops || !f->f_mapping->a_ops->direct_IO)
- return -EINVAL;
- }
- return 0;
-}
-
static int do_dentry_open(struct file *f,
struct inode *inode,
int (*open)(struct inode *, struct file *))
@@ -809,6 +799,11 @@ static int do_dentry_open(struct file *f,

file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);

+ /* NB: we're sure to have correct a_ops only after f_op->open */
+ if (f->f_flags & O_DIRECT) {
+ if (!f->f_mapping->a_ops || !f->f_mapping->a_ops->direct_IO)
+ return -EINVAL;
+ }
return 0;

cleanup_all:
@@ -925,8 +920,6 @@ struct file *dentry_open(const struct path *path, int flags,
if (!IS_ERR(f)) {
f->f_flags = flags;
error = vfs_open(path, f);
- if (!error)
- error = open_check_o_direct(f);
if (error) {
fput(f);
f = ERR_PTR(error);
--
2.11.0


2018-07-11 02:28:34

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 04/42] make get_empty_filp() to call file_free_rcu() directly

From: Al Viro <[email protected]>

no point in rcu-delays here

Signed-off-by: Al Viro <[email protected]>
---
fs/file_table.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 7ec0b3e5f05d..1f14b80a4e67 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -123,11 +123,10 @@ struct file *get_empty_filp(void)
if (unlikely(!f))
return ERR_PTR(-ENOMEM);

- percpu_counter_inc(&nr_files);
f->f_cred = get_cred(cred);
error = security_file_alloc(f);
if (unlikely(error)) {
- file_free(f);
+ file_free_rcu(&f->f_u.fu_rcuhead);
return ERR_PTR(error);
}

@@ -137,6 +136,7 @@ struct file *get_empty_filp(void)
mutex_init(&f->f_pos_lock);
eventpoll_init_file(f);
/* f->f_version: 0 */
+ percpu_counter_inc(&nr_files);
return f;

over:
--
2.11.0


2018-07-11 02:28:59

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 02/42] cxl_getfile(): fix double-iput() on alloc_file() failures

From: Al Viro <[email protected]>

Doing iput() after path_put() is wrong.

Cc: [email protected]
Signed-off-by: Al Viro <[email protected]>
---
drivers/misc/cxl/api.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 753b1a698fc4..6b16946f9b05 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -103,15 +103,15 @@ static struct file *cxl_getfile(const char *name,
d_instantiate(path.dentry, inode);

file = alloc_file(&path, OPEN_FMODE(flags), fops);
- if (IS_ERR(file))
- goto err_dput;
+ if (IS_ERR(file)) {
+ path_put(&path);
+ goto err_fs;
+ }
file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
file->private_data = priv;

return file;

-err_dput:
- path_put(&path);
err_inode:
iput(inode);
err_fs:
--
2.11.0


2018-07-11 02:28:59

by Al Viro

[permalink] [raw]
Subject: [RFC][PATCH 05/42] fold security_file_free() into file_free()

From: Al Viro <[email protected]>

Signed-off-by: Al Viro <[email protected]>
---
fs/file_table.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 1f14b80a4e67..eee7cf629e52 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -51,6 +51,7 @@ static void file_free_rcu(struct rcu_head *head)

static inline void file_free(struct file *f)
{
+ security_file_free(f);
percpu_counter_dec(&nr_files);
call_rcu(&f->f_u.fu_rcuhead, file_free_rcu);
}
@@ -207,7 +208,6 @@ static void __fput(struct file *file)
}
if (file->f_op->release)
file->f_op->release(inode, file);
- security_file_free(file);
if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
!(file->f_mode & FMODE_PATH))) {
cdev_put(inode->i_cdev);
@@ -302,10 +302,8 @@ EXPORT_SYMBOL(fput);

void put_filp(struct file *file)
{
- if (atomic_long_dec_and_test(&file->f_count)) {
- security_file_free(file);
+ if (atomic_long_dec_and_test(&file->f_count))
file_free(file);
- }
}

void __init files_init(void)
--
2.11.0


2018-07-11 02:37:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 04/42] make get_empty_filp() to call file_free_rcu() directly

This one looked odd to me.

Then I saw 5/42, and it made more sense.

I think the explanation is a bit misleading. Technically correct, but
not *why* you did it.

Linus

2018-07-11 02:40:11

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 08/42] make sure do_dentry_open() won't return positive as an error

On Tue, Jul 10, 2018 at 7:23 PM Al Viro <[email protected]> wrote:
>
>
> cleanup_all:
> + if (unlikely(error > 0)) {
> + WARN_ON(1);
> + error = -EINVAL;
> + }

Can we please do this as

if (WARN_ON_ONCE(error > 0))
error = -EINVAL;


instead?

That already should do the unlikely for you, and *if* somebody can
trigger this, at least they won't be spamming the logs and making a
mess. A single warning is fine.

Linus

2018-07-11 02:43:49

by Al Viro

[permalink] [raw]
Subject: Re: [RFC][PATCH 08/42] make sure do_dentry_open() won't return positive as an error

On Tue, Jul 10, 2018 at 07:39:07PM -0700, Linus Torvalds wrote:
> On Tue, Jul 10, 2018 at 7:23 PM Al Viro <[email protected]> wrote:
> >
> >
> > cleanup_all:
> > + if (unlikely(error > 0)) {
> > + WARN_ON(1);
> > + error = -EINVAL;
> > + }
>
> Can we please do this as
>
> if (WARN_ON_ONCE(error > 0))
> error = -EINVAL;
>
>
> instead?
>
> That already should do the unlikely for you, and *if* somebody can
> trigger this, at least they won't be spamming the logs and making a
> mess. A single warning is fine.

No problem, will do...

2018-07-11 02:44:40

by Al Viro

[permalink] [raw]
Subject: Re: [RFC][PATCH 04/42] make get_empty_filp() to call file_free_rcu() directly

On Tue, Jul 10, 2018 at 07:35:23PM -0700, Linus Torvalds wrote:
> This one looked odd to me.
>
> Then I saw 5/42, and it made more sense.
>
> I think the explanation is a bit misleading. Technically correct, but
> not *why* you did it.

Probably should fold these two together, now that reordering has brought
them next to each other (originally there used to be several pieces in
between)...

2018-07-11 02:46:00

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 16/42] now we can fold open_check_o_direct() into do_dentry_open()

I like the patch, I hate the commit message.

It makes sense right now in this sequence, but I'd really like the
commit message to say _why_ this sequence led up to this point.

Right now I still remember you trying this, and having to revert it
because it didn't work before all the fput/put_filp issues. But a year
from now? Five years from now?

So at least a "now that fput() works regardless of how far the open
got.." kind of explanation, ok?

Linus

2018-07-11 02:58:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 01/42] drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()

Ok, you didn't seem to have a coverletter email, so I'm just replying
to the first one.

Apart from the couple of totally trivial things I reacted to, this
looks very clean and nice. And now I sat in front of the computer
while reading it, so I could follow along better.

So apart from the small stylistic nits, all Acked-by from me.

Linus

2018-07-11 03:00:52

by Al Viro

[permalink] [raw]
Subject: Re: [RFC][PATCH 16/42] now we can fold open_check_o_direct() into do_dentry_open()

On Tue, Jul 10, 2018 at 07:44:59PM -0700, Linus Torvalds wrote:
> I like the patch, I hate the commit message.
>
> It makes sense right now in this sequence, but I'd really like the
> commit message to say _why_ this sequence led up to this point.
>
> Right now I still remember you trying this, and having to revert it
> because it didn't work before all the fput/put_filp issues. But a year
> from now? Five years from now?
>
> So at least a "now that fput() works regardless of how far the open
> got.." kind of explanation, ok?

Umm... Something like

These checks are better off in do_dentry_open(); the reason we couldn't
put them there used to be that callers couldn't tell what kind of cleanup
would do_dentry_open() failure call for. Now that we have FMODE_OPENED,
cleanup is the same in all cases - it's simply fput(). So let's fold
that into do_dentry_open(), as Christoph's patch tried to.

perhaps?

2018-07-11 03:14:54

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 16/42] now we can fold open_check_o_direct() into do_dentry_open()

On Tue, Jul 10, 2018 at 7:59 PM Al Viro <[email protected]> wrote:
>
> Umm... Something like [..]

Ack.

Linus

2018-07-11 05:44:48

by Amir Goldstein

[permalink] [raw]
Subject: Re: [RFC][PATCH 15/42] lift fput() on late failures into path_openat()

On Wed, Jul 11, 2018 at 5:21 AM, Al Viro <[email protected]> wrote:
[...]
> @@ -3407,8 +3407,6 @@ static int do_last(struct nameidata *nd,
> if (!error && will_truncate)
> error = handle_truncate(file);
> out:
> - if (unlikely(error) && (*opened & FILE_OPENED))
> - fput(file);
> if (unlikely(error > 0)) {
> WARN_ON(1);

Another opportunity to squash WARN_ON(1) with condition.

> error = -EINVAL;
> @@ -3484,8 +3482,6 @@ static int do_tmpfile(struct nameidata *nd, unsigned flags,
> if (error)
> goto out2;
> error = open_check_o_direct(file);
> - if (error)
> - fput(file);
> out2:
> mnt_drop_write(path.mnt);
> out:
> @@ -3547,20 +3543,20 @@ static struct file *path_openat(struct nameidata *nd,
> }
> terminate_walk(nd);
> out2:
> - if (!(opened & FILE_OPENED)) {
> - BUG_ON(!error);
> - fput(file);
> + if (likely(!error)) {
> + if (likely(opened & FILE_OPENED))
> + return file;
> + WARN_ON(1);

This style may be open for debate but:

if (!WARN_ON(!(opened & FILE_OPENED)))
return file;

Maybe we need WARN_UNLESS() or WARN_ASSERT()

Thanks,
Amir.

2018-07-11 19:10:52

by Al Viro

[permalink] [raw]
Subject: Re: [RFC][PATCH 01/42] drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()

On Wed, Jul 11, 2018 at 04:25:55PM +0100, Al Viro wrote:

> FWIW, looking at the ->f_flags handling, I'm seriously tempted to do
> alloc_file_pseudo(inode, mnt, name, f_flags, ops).
>
> Reason: right now all but two callers of alloc_file_pseudo() are followed
> by setting ->f_flags and for all those callers the mode argument passed
> to alloc_file_pseudo() is equal to OPEN_FMODE(->f_flags value to be).
>
> The exceptions are __shmem_file_setup() and hugetlb_file_setup(). Both
> end up with FMODE_READ | FMODE_WRITE combined with 0 (i.e. O_RDONLY) in
> f_flags. Which smells like a bug in making, at the very least.
>
> Unless somebody has a good reason why those shouldn't have O_RDWR,
> I want to add (and fold back into individual "convert to alloc_file_pseudo"
> commits) the following:

[snip]

> Objections? Another odd beastie is do_shmat() - there we end up with
> f_mode not matching f_flags, same way as in shmem and hugetlb. If we
> could rectify that one as well, we'd be able to switch alloc_file_clone()
> to flags as well. I would obviously prefer that kind of change to happen
> before these helpers go into mainline...

Actually, looking at the entire thing, I'm rather tempted to go for
alloc_empty_file(f_flags, cred)
setting both f_flags and f_flags-derived part of f_mode, making
alloc_file_pseudo(inode, mnt, name, f_flags, ops)
alloc_file_clone(base, f_flags, ops)
do the same automatically as they call alloc_empty_file().

do_dentry_open() would, instead of
f->f_mode |= OPEN_FMODE(f->f_flags) | FMODE_LSEEK |
FMODE_PREAD | FMODE_PWRITE;
do just
f->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE;
just before calling ->open(), with comment along the lines of
"usually we want those set; let ->open() remove the wrong ones
if it wants to".

2018-07-11 21:03:09

by Al Viro

[permalink] [raw]
Subject: Re: [RFC][PATCH 01/42] drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()

On Tue, Jul 10, 2018 at 07:56:51PM -0700, Linus Torvalds wrote:
> Ok, you didn't seem to have a coverletter email, so I'm just replying
> to the first one.
>
> Apart from the couple of totally trivial things I reacted to, this
> looks very clean and nice. And now I sat in front of the computer
> while reading it, so I could follow along better.
>
> So apart from the small stylistic nits, all Acked-by from me.

FWIW, looking at the ->f_flags handling, I'm seriously tempted to do
alloc_file_pseudo(inode, mnt, name, f_flags, ops).

Reason: right now all but two callers of alloc_file_pseudo() are followed
by setting ->f_flags and for all those callers the mode argument passed
to alloc_file_pseudo() is equal to OPEN_FMODE(->f_flags value to be).

The exceptions are __shmem_file_setup() and hugetlb_file_setup(). Both
end up with FMODE_READ | FMODE_WRITE combined with 0 (i.e. O_RDONLY) in
f_flags. Which smells like a bug in making, at the very least.

Unless somebody has a good reason why those shouldn't have O_RDWR,
I want to add (and fold back into individual "convert to alloc_file_pseudo"
commits) the following:

diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index 77e2d623e115..6b8f5e37f0f7 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -613,7 +613,7 @@ in your dentry operations instead.
--
[mandatory]
alloc_file() has become static now; two wrappers are to be used instead.
- alloc_file_pseudo(inode, vfsmount, name, mode, ops) is for the cases
+ alloc_file_pseudo(inode, vfsmount, name, flags, ops) is for the cases
when dentry needs to be created; that's the majority of old alloc_file()
users. Calling conventions: on success a reference to new struct file
is returned and callers reference to inode is subsumed by that. On
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index e0b9c00aecde..8fd5ec4d6042 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -90,11 +90,10 @@ static struct file *cxl_getfile(const char *name,
}

file = alloc_file_pseudo(inode, cxl_vfs_mount, name,
- OPEN_FMODE(flags), fops);
+ flags & (O_ACCMODE | O_NONBLOCK), fops);
if (IS_ERR(file))
goto err_inode;

- file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
file->private_data = priv;

return file;
diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_hw.c
index 6d0632174ec6..a43d44e7e7dd 100644
--- a/drivers/scsi/cxlflash/ocxl_hw.c
+++ b/drivers/scsi/cxlflash/ocxl_hw.c
@@ -115,7 +115,7 @@ static struct file *ocxlflash_getfile(struct device *dev, const char *name,
}

file = alloc_file_pseudo(inode, ocxlflash_vfs_mount, name,
- OPEN_FMODE(flags), fops);
+ flags & (O_ACCMODE | O_NONBLOCK), fops);
if (IS_ERR(file)) {
rc = PTR_ERR(file);
dev_err(dev, "%s: alloc_file failed rc=%d\n",
@@ -123,7 +123,6 @@ static struct file *ocxlflash_getfile(struct device *dev, const char *name,
goto err4;
}

- file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
file->private_data = priv;
out:
return file;
diff --git a/fs/aio.c b/fs/aio.c
index c5244c68f90e..c3a8bac16374 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -225,11 +225,9 @@ static struct file *aio_private_file(struct kioctx *ctx, loff_t nr_pages)
inode->i_size = PAGE_SIZE * nr_pages;

file = alloc_file_pseudo(inode, aio_mnt, "[aio]",
- FMODE_READ | FMODE_WRITE, &aio_ring_fops);
+ O_RDWR, &aio_ring_fops);
if (IS_ERR(file))
iput(inode);
- else
- file->f_flags = O_RDWR;
return file;
}

diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c
index 7e13edd23db1..9a3c765fc7b1 100644
--- a/fs/anon_inodes.c
+++ b/fs/anon_inodes.c
@@ -85,13 +85,11 @@ struct file *anon_inode_getfile(const char *name,
*/
ihold(anon_inode_inode);
file = alloc_file_pseudo(anon_inode_inode, anon_inode_mnt, name,
- OPEN_FMODE(flags), fops);
+ flags & (O_ACCMODE | O_NONBLOCK), fops);
if (IS_ERR(file))
goto err;

file->f_mapping = anon_inode_inode->i_mapping;
-
- file->f_flags = flags & (O_ACCMODE | O_NONBLOCK);
file->private_data = priv;

return file;
diff --git a/fs/file_table.c b/fs/file_table.c
index c5f651fd6830..af9141d8e29f 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -182,7 +182,7 @@ static struct file *alloc_file(const struct path *path, fmode_t mode,
}

struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
- const char *name, fmode_t mode,
+ const char *name, int flags,
const struct file_operations *fops)
{
static const struct dentry_operations anon_ops = {
@@ -199,11 +199,12 @@ struct file *alloc_file_pseudo(struct inode *inode, struct vfsmount *mnt,
d_set_d_op(path.dentry, &anon_ops);
path.mnt = mntget(mnt);
d_instantiate(path.dentry, inode);
- file = alloc_file(&path, mode, fops);
+ file = alloc_file(&path, OPEN_FMODE(flags), fops);
if (IS_ERR(file)) {
ihold(inode);
path_put(&path);
}
+ file->f_flags = flags;
return file;
}
EXPORT_SYMBOL(alloc_file_pseudo);
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 86ffe04f73d6..87605c73361b 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1358,8 +1358,7 @@ struct file *hugetlb_file_setup(const char *name, size_t size,
acctflag))
file = ERR_PTR(-ENOMEM);
else
- file = alloc_file_pseudo(inode, mnt, name,
- FMODE_WRITE | FMODE_READ,
+ file = alloc_file_pseudo(inode, mnt, name, O_RDWR,
&hugetlbfs_file_operations);
if (!IS_ERR(file))
return file;
diff --git a/fs/pipe.c b/fs/pipe.c
index 94323a1d7c92..f5d30579e017 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -750,14 +750,15 @@ int create_pipe_files(struct file **res, int flags)
if (!inode)
return -ENFILE;

- f = alloc_file_pseudo(inode, pipe_mnt, "", FMODE_WRITE, &pipefifo_fops);
+ f = alloc_file_pseudo(inode, pipe_mnt, "",
+ O_WRONLY | (flags & (O_NONBLOCK | O_DIRECT)),
+ &pipefifo_fops);
if (IS_ERR(f)) {
free_pipe_info(inode->i_pipe);
iput(inode);
return PTR_ERR(f);
}

- f->f_flags = O_WRONLY | (flags & (O_NONBLOCK | O_DIRECT));
f->private_data = inode->i_pipe;

res[0] = alloc_file_clone(f, FMODE_READ, &pipefifo_fops);
diff --git a/include/linux/file.h b/include/linux/file.h
index 325b36ca336d..1972776e1677 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -20,7 +20,7 @@ struct dentry;
struct inode;
struct path;
extern struct file *alloc_file_pseudo(struct inode *, struct vfsmount *,
- const char *, fmode_t, const struct file_operations *);
+ const char *, int, const struct file_operations *);
extern struct file *alloc_file_clone(struct file *, fmode_t,
const struct file_operations *);

diff --git a/mm/shmem.c b/mm/shmem.c
index fd21df189f32..d7e984141be5 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3922,8 +3922,7 @@ static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, l
clear_nlink(inode); /* It is unlinked */
res = ERR_PTR(ramfs_nommu_expand_for_mapping(inode, size));
if (!IS_ERR(res))
- res = alloc_file_pseudo(inode, mnt, name,
- FMODE_WRITE | FMODE_READ,
+ res = alloc_file_pseudo(inode, mnt, name, O_RDWR,
&shmem_file_operations);
if (IS_ERR(res))
iput(inode);
diff --git a/net/socket.c b/net/socket.c
index 81cf9906cae5..4cf3568caf9f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -397,14 +397,14 @@ struct file *sock_alloc_file(struct socket *sock, int flags, const char *dname)
dname = sock->sk ? sock->sk->sk_prot_creator->name : "";

file = alloc_file_pseudo(SOCK_INODE(sock), sock_mnt, dname,
- FMODE_READ | FMODE_WRITE, &socket_file_ops);
+ O_RDWR | (flags & O_NONBLOCK),
+ &socket_file_ops);
if (IS_ERR(file)) {
sock_release(sock);
return file;
}

sock->file = file;
- file->f_flags = O_RDWR | (flags & O_NONBLOCK);
file->private_data = sock;
return file;
}


Objections? Another odd beastie is do_shmat() - there we end up with
f_mode not matching f_flags, same way as in shmem and hugetlb. If we
could rectify that one as well, we'd be able to switch alloc_file_clone()
to flags as well. I would obviously prefer that kind of change to happen
before these helpers go into mainline...

2018-07-12 12:44:56

by Al Viro

[permalink] [raw]
Subject: Re: [RFC][PATCH 01/42] drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()

On Wed, Jul 11, 2018 at 05:15:40PM +0100, Al Viro wrote:

> Actually, looking at the entire thing, I'm rather tempted to go for
> alloc_empty_file(f_flags, cred)
> setting both f_flags and f_flags-derived part of f_mode, making
> alloc_file_pseudo(inode, mnt, name, f_flags, ops)
> alloc_file_clone(base, f_flags, ops)
> do the same automatically as they call alloc_empty_file().

See vfs.git#work.open3; one commit added just before "pass creds
to get_empty_filp(), make sure dentry_open() passes the right creds",
one just after.

The first one ("alloc_file(): switch to passing O_... flags instead of
FMODE_... mode") pulls ->f_flags assignments into preceding alloc_file(),
the second ("pass ->f_flags value to alloc_empty_file()") makes
alloc_empty_file() set ->f_flags and ->f_flags-derived part of ->f_mode,
with do_dentry_open() leaving these parts of ->f_mode alone.

The rest is pretty much identical to the previous iteration of the
series, except that alloc_file wrappers inherit the switch to passing
O_...

A question regarding the customs in such situations - are previous
Reviewed-by/Acked-by normally kept across rebases like that?

2018-07-12 15:07:21

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC][PATCH 01/42] drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open()

On Thu, Jul 12, 2018 at 5:43 AM Al Viro <[email protected]> wrote:
>
> A question regarding the customs in such situations - are previous
> Reviewed-by/Acked-by normally kept across rebases like that?

Yeah, unless there were big changes, keep the reviewed/acked-by lines.

Otherwise you'd never be able to handle different people giving
slightly different feedback about separate issues.

Linus

2018-07-12 15:55:15

by Al Viro

[permalink] [raw]
Subject: vfs / overlayfs conflict resolution for linux-next

On Thu, Jul 12, 2018 at 08:05:26AM -0700, Linus Torvalds wrote:
> On Thu, Jul 12, 2018 at 5:43 AM Al Viro <[email protected]> wrote:
> >
> > A question regarding the customs in such situations - are previous
> > Reviewed-by/Acked-by normally kept across rebases like that?
>
> Yeah, unless there were big changes, keep the reviewed/acked-by lines.
>
> Otherwise you'd never be able to handle different people giving
> slightly different feedback about separate issues.

OK... Miklos, I've pushed #ovl-candidate, with equivalent of the beginning
of your branch. I'm *not* saying that I've no remaining issues
with your series - this is just how I'd prefer to resolve that group
of conflicts.

Everything past "vfs: simplify dentry_open()" could live on top of that
one, or its equivalent.

I'm going to put #work-open3 into -next, let's figure out what to do with
the conflicts; what I can promise is never-rebased status for #for-ovl
(the beginning of #work-open3 merged into #ovl-candidate).

2018-07-18 02:57:31

by Al Viro

[permalink] [raw]
Subject: Re: vfs / overlayfs conflict resolution for linux-next

On Thu, Jul 12, 2018 at 04:53:37PM +0100, Al Viro wrote:
> On Thu, Jul 12, 2018 at 08:05:26AM -0700, Linus Torvalds wrote:
> > On Thu, Jul 12, 2018 at 5:43 AM Al Viro <[email protected]> wrote:
> > >
> > > A question regarding the customs in such situations - are previous
> > > Reviewed-by/Acked-by normally kept across rebases like that?
> >
> > Yeah, unless there were big changes, keep the reviewed/acked-by lines.
> >
> > Otherwise you'd never be able to handle different people giving
> > slightly different feedback about separate issues.
>
> OK... Miklos, I've pushed #ovl-candidate, with equivalent of the beginning
> of your branch. I'm *not* saying that I've no remaining issues
> with your series - this is just how I'd prefer to resolve that group
> of conflicts.
>
> Everything past "vfs: simplify dentry_open()" could live on top of that
> one, or its equivalent.
>
> I'm going to put #work-open3 into -next, let's figure out what to do with
> the conflicts; what I can promise is never-rebased status for #for-ovl
> (the beginning of #work-open3 merged into #ovl-candidate).

... and now it even builds. Said that, I would really like to hear something
from you - I can duplicate the entire overlayfs-next and merge it into
my #for-next and ask Steven to use that instead of your tree, but I very
much dislike going over your head like that.

I realize that you'd been away for a while and probably are digging yourself
from under the piles of mail, but it's getting late in the cycle and I want
to get #for-next into reasonably sane shape. Please, look through that
thing and respond.

2018-07-18 03:30:56

by Stephen Rothwell

[permalink] [raw]
Subject: Re: vfs / overlayfs conflict resolution for linux-next

Hi Al,

On Wed, 18 Jul 2018 03:56:37 +0100 Al Viro <[email protected]> wrote:
>
> ... and now it even builds. Said that, I would really like to hear something
> from you - I can duplicate the entire overlayfs-next and merge it into
> my #for-next and ask Steven to use that instead of your tree, but I very
> much dislike going over your head like that.
>
> I realize that you'd been away for a while and probably are digging yourself
> from under the piles of mail, but it's getting late in the cycle and I want
> to get #for-next into reasonably sane shape. Please, look through that
> thing and respond.

Almost everything has been removed from the overlayfs tree in
linux-next today. The only commit there currently is:

67810693077a ovl: fix wrong use of impure dir cache in ovl_iterate()
--
Cheers,
Stephen Rothwell


Attachments:
(No filename) (499.00 B)
OpenPGP digital signature

2018-07-18 07:27:48

by Miklos Szeredi

[permalink] [raw]
Subject: Re: vfs / overlayfs conflict resolution for linux-next

On Wed, Jul 18, 2018 at 5:29 AM, Stephen Rothwell <[email protected]> wrote:
> Hi Al,
>
> On Wed, 18 Jul 2018 03:56:37 +0100 Al Viro <[email protected]> wrote:
>>
>> ... and now it even builds. Said that, I would really like to hear something
>> from you - I can duplicate the entire overlayfs-next and merge it into
>> my #for-next and ask Steven to use that instead of your tree, but I very
>> much dislike going over your head like that.
>>
>> I realize that you'd been away for a while and probably are digging yourself
>> from under the piles of mail, but it's getting late in the cycle and I want
>> to get #for-next into reasonably sane shape. Please, look through that
>> thing and respond.
>
> Almost everything has been removed from the overlayfs tree in
> linux-next today. The only commit there currently is:
>
> 67810693077a ovl: fix wrong use of impure dir cache in ovl_iterate()

Al, thank you very much for taking care of this. I've already begone
to go through those and will finish up the merge, hopefully today.

Thanks,
Miklos

2018-07-18 12:11:46

by Miklos Szeredi

[permalink] [raw]
Subject: Re: vfs / overlayfs conflict resolution for linux-next

On Wed, Jul 18, 2018 at 9:25 AM, Miklos Szeredi <[email protected]> wrote:
> On Wed, Jul 18, 2018 at 5:29 AM, Stephen Rothwell <[email protected]> wrote:
>> Hi Al,
>>
>> On Wed, 18 Jul 2018 03:56:37 +0100 Al Viro <[email protected]> wrote:
>>>
>>> ... and now it even builds. Said that, I would really like to hear something
>>> from you - I can duplicate the entire overlayfs-next and merge it into
>>> my #for-next and ask Steven to use that instead of your tree, but I very
>>> much dislike going over your head like that.
>>>
>>> I realize that you'd been away for a while and probably are digging yourself
>>> from under the piles of mail, but it's getting late in the cycle and I want
>>> to get #for-next into reasonably sane shape. Please, look through that
>>> thing and respond.

In "ovl: stack file ops" this:

AV: make it use open_with_fake_path(), don't mess with override_creds

Maybe it's the way to go, but looks broken as is; e.g. NFS does call
current_creds() from its open method to get the credentials to work
with.

Okay, so ->open() is a file op, and file ops should use file->f_cred,
but how are we going to enforce this?

Thanks,
Miklos

2018-07-18 12:44:42

by Al Viro

[permalink] [raw]
Subject: Re: vfs / overlayfs conflict resolution for linux-next

On Wed, Jul 18, 2018 at 02:10:32PM +0200, Miklos Szeredi wrote:
> On Wed, Jul 18, 2018 at 9:25 AM, Miklos Szeredi <[email protected]> wrote:
> > On Wed, Jul 18, 2018 at 5:29 AM, Stephen Rothwell <[email protected]> wrote:
> >> Hi Al,
> >>
> >> On Wed, 18 Jul 2018 03:56:37 +0100 Al Viro <[email protected]> wrote:
> >>>
> >>> ... and now it even builds. Said that, I would really like to hear something
> >>> from you - I can duplicate the entire overlayfs-next and merge it into
> >>> my #for-next and ask Steven to use that instead of your tree, but I very
> >>> much dislike going over your head like that.
> >>>
> >>> I realize that you'd been away for a while and probably are digging yourself
> >>> from under the piles of mail, but it's getting late in the cycle and I want
> >>> to get #for-next into reasonably sane shape. Please, look through that
> >>> thing and respond.
>
> In "ovl: stack file ops" this:
>
> AV: make it use open_with_fake_path(), don't mess with override_creds
>
> Maybe it's the way to go, but looks broken as is; e.g. NFS does call
> current_creds() from its open method to get the credentials to work
> with.

It *is* broken. For now leave override_creds() as in your variant, but
we really want to deal with that crap eventually.

> Okay, so ->open() is a file op, and file ops should use file->f_cred,
> but how are we going to enforce this?

I'd say we cut down on the use of current_cred() when deep in call chain...

2018-07-18 13:47:29

by Al Viro

[permalink] [raw]
Subject: Re: vfs / overlayfs conflict resolution for linux-next

On Wed, Jul 18, 2018 at 01:43:40PM +0100, Al Viro wrote:

> It *is* broken. For now leave override_creds() as in your variant, but
> we really want to deal with that crap eventually.
>
> > Okay, so ->open() is a file op, and file ops should use file->f_cred,
> > but how are we going to enforce this?
>
> I'd say we cut down on the use of current_cred() when deep in call chain...

Actually, how about this:

#define call_with_creds(__cred, expr) ({ \
__typeof__(expr) ____res; \
const struct cred *____old = current->cred; \
const struct cred *____new = (__cred); \
rcu_assign_pointer(current->cred, ____new); \
____res = expr; \
BUG_ON(current->cred != ____new); \
rcu_assign_pointer(current->cred, ____old); \
____res; \
})

and replacing
error = open(inode, f);
with
error = call_with_cred(f->f_cred, open(inode, f));

possibly with similar at other methods callsites? Unlike override_creds()
and revert_creds() it's cheap - no validation of creds, no cacheline
ping-pong...

Folks?

2018-07-18 15:46:59

by Miklos Szeredi

[permalink] [raw]
Subject: Re: vfs / overlayfs conflict resolution for linux-next

On Wed, Jul 18, 2018 at 2:43 PM, Al Viro <[email protected]> wrote:
> On Wed, Jul 18, 2018 at 02:10:32PM +0200, Miklos Szeredi wrote:
>> On Wed, Jul 18, 2018 at 9:25 AM, Miklos Szeredi <[email protected]> wrote:
>> > On Wed, Jul 18, 2018 at 5:29 AM, Stephen Rothwell <[email protected]> wrote:
>> >> Hi Al,
>> >>
>> >> On Wed, 18 Jul 2018 03:56:37 +0100 Al Viro <[email protected]> wrote:
>> >>>
>> >>> ... and now it even builds. Said that, I would really like to hear something
>> >>> from you - I can duplicate the entire overlayfs-next and merge it into
>> >>> my #for-next and ask Steven to use that instead of your tree, but I very
>> >>> much dislike going over your head like that.
>> >>>
>> >>> I realize that you'd been away for a while and probably are digging yourself
>> >>> from under the piles of mail, but it's getting late in the cycle and I want
>> >>> to get #for-next into reasonably sane shape. Please, look through that
>> >>> thing and respond.

Pushed updated series based on your vfs.git#for-ovl branch to the
overlayfs-next tree. There's the additional patch dealing with
nr_files accounting (will post for review shortly). That one has a
trivial conflict with the mount series, otherwise merges cleanly with
viro/vfs.git#for-next.

I like the call_with_creds() idea. I didn't realize that
override_creds()/revert_creds() can be quite heavyweight due to doing
(unnecessary in this case) refcounting. Could use call_with_creds()
in overlayfs too, since we hold ref on ofs->creator_cred for the
lifetime of the filesystem.

Thanks,
Miklos

2018-07-18 18:13:48

by Al Viro

[permalink] [raw]
Subject: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 05:46:09PM +0200, Miklos Szeredi wrote:

> I like the call_with_creds() idea. I didn't realize that
> override_creds()/revert_creds() can be quite heavyweight due to doing
> (unnecessary in this case) refcounting. Could use call_with_creds()
> in overlayfs too, since we hold ref on ofs->creator_cred for the
> lifetime of the filesystem.

OK...

/* expr is non-void here */
#define __call_with_creds(__cred, expr) ({ \
const struct cred *____old = current->cred; \
const struct cred *____new = __cred; \
__typeof__(expr) __res; \
rcu_assign_pointer(current->cred, ____new); \
__res = (expr); \
BUG_ON(current->cred != ____new); \
rcu_assign_pointer(current->cred, ____old); \
__res; })

/* expr for non-void, expr,0 for void */
#define __wrap_void(expr) __builtin_choose_expr( \
__builtin_types_compatible_p(__typeof__(expr), void), \
((expr),0), (expr))

/*
* Evaluate an expression with current->cred temporary set to __cred.
* NOTE: expr must not result in changed ->cred - any changes during
* its evaluation must be undone.
*/
#define call_with_creds(__cred, expr) \
((__typeof__(expr))__call_with_creds(__cred, __wrap_void(expr)))

Linus, David - do you have any objections to the above? I would like
to do some of the file method calls via that - e.g. have callers of
->write() (both __vfs_write() and do_loop_readv_writev()) use
nr = call_with_creds(file->f_cred,
file->f_op->write(file, iovec.iov_base,
iovec.iov_len, ppos));
instead of
nr = file->f_op->write(file, iovec.iov_base,
iovec.iov_len, ppos));
Ditto for call of open() in do_dentry_open(), etc.

2018-07-18 18:20:23

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 11:13 AM Al Viro <[email protected]> wrote:
>
> Linus, David - do you have any objections to the above?

I damn well do.

I explained earlier why it's wrong and fragile, and why it can just
cause the *reverse* security problem if you do it wrong. So now you
take a subtle bug, and make it even more subtle, and encourage people
to do this known-broken model of using creds at IO time.

No.

Some debugging option to just clear current->creds entirely and catch
mis-uses, sure. But saying "we have shit buggy garbage in random write
functions, so we'll just paper over it"? No.

Linus

2018-07-18 19:49:32

by Al Viro

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 11:19:18AM -0700, Linus Torvalds wrote:
> On Wed, Jul 18, 2018 at 11:13 AM Al Viro <[email protected]> wrote:
> >
> > Linus, David - do you have any objections to the above?
>
> I damn well do.
>
> I explained earlier why it's wrong and fragile, and why it can just
> cause the *reverse* security problem if you do it wrong. So now you
> take a subtle bug, and make it even more subtle, and encourage people
> to do this known-broken model of using creds at IO time.
>
> No.
>
> Some debugging option to just clear current->creds entirely and catch
> mis-uses, sure. But saying "we have shit buggy garbage in random write
> functions, so we'll just paper over it"? No.

Huh? Nevermind ->write(), what about open()? Here's a specific question
Miklos brought when I suggested to get rid of that override:
/*
* These allocate and release file read/write context information.
*/
int nfs_open(struct inode *inode, struct file *filp)
{
struct nfs_open_context *ctx;

ctx = alloc_nfs_open_context(file_dentry(filp), filp->f_mode, filp);

struct nfs_open_context *alloc_nfs_open_context(struct dentry *dentry,
fmode_t f_mode,
struct file *filp)
{
struct nfs_open_context *ctx;
struct rpc_cred *cred = rpc_lookup_cred();

struct rpc_cred *rpc_lookup_cred(void)
{
return rpcauth_lookupcred(&generic_auth, 0);

struct rpc_cred *
rpcauth_lookupcred(struct rpc_auth *auth, int flags)
{
struct auth_cred acred;
struct rpc_cred *ret;
const struct cred *cred = current_cred();

How should we bring the cred passed to do_dentry_open() where open() has been
called to rpcauth_lookupcred() where we end up looking for rpc_cred by what
should've been the cred passed to do_dentry_open() and is, instead, current_cred()?

We can pass filp->f_cred to rpc_lookup_cred() variant that gets it as an explicit
argument and feed it down to rpcauth_lookupcred() variant that does the same.
We can basically ignore the ->f_cred here. Or we can get current_cred() equal
to ->f_cred for the duration of open().

I'd probably prefer the first variant, but the last part of the question Miklos
asked
> Okay, so ->open() is a file op, and file ops should use file->f_cred,
> but how are we going to enforce this?
is not trivial - how do we find the places where that kind of thing happens and
what do we do in the meanwhile? I don't see any quick answers - any suggestions
would be very welcome. It's not just direct current_cred() callers; that stuff
gets called deep in call chains. And lifting it all the way up means a lot of
methods that need to get an explicit struct cred * argument. Are you OK with
going in that direction?

I'm honestly not sure - it's not an attempt to maneuver you into changing your
policy re ->write(). Do we care about ->f_cred at all and if we do, how do we
get it consistent across the filesystems? I'd buy "it's a weird and obscure thing"
for overlayfs, but that example is on NFS...

We definitely do have bugs in that area - consider e.g.
static int ecryptfs_threadfn(void *ignored)
{
set_freezable();
while (1) {
struct ecryptfs_open_req *req;

wait_event_freezable(
ecryptfs_kthread_ctl.wait,
(!list_empty(&ecryptfs_kthread_ctl.req_list)
|| kthread_should_stop()));
mutex_lock(&ecryptfs_kthread_ctl.mux);
if (ecryptfs_kthread_ctl.flags & ECRYPTFS_KTHREAD_ZOMBIE) {
mutex_unlock(&ecryptfs_kthread_ctl.mux);
goto out;
}
while (!list_empty(&ecryptfs_kthread_ctl.req_list)) {
req = list_first_entry(&ecryptfs_kthread_ctl.req_list,
struct ecryptfs_open_req,
kthread_ctl_list);
list_del(&req->kthread_ctl_list);
*req->lower_file = dentry_open(&req->path,
(O_RDWR | O_LARGEFILE), current_cred());
complete(&req->done);
}
mutex_unlock(&ecryptfs_kthread_ctl.mux);
}
out:
return 0;
}

It's a kernel thread, so current_cred() looks bogus...

2018-07-18 19:57:51

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 12:46 PM Al Viro <[email protected]> wrote:
>
> Huh? Nevermind ->write(), what about open()?

What about open?

At open time, file->f_cred is the same as current_cred().

So yes, open uses current cred. What's the problem?

Now, if you then use a tasklet or some other thread to do the open,
then obviously that is no longer true. But then the problem is that
you're doing the open() itself in the wrong context, and that has
nothing to do with any general issue, and everything to do with "you
changed to another context without pulling all the context data with
you - you're buggy". Doing some kind of "call_with_creds()" isn't the
solultion - it's just part of the whole thing (what about user
accounting etc? If you switch to another thread to do the work, you
have way more issues than just creds).

Linus

2018-07-18 20:07:01

by Al Viro

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 12:53:48PM -0700, Linus Torvalds wrote:
> On Wed, Jul 18, 2018 at 12:46 PM Al Viro <[email protected]> wrote:
> >
> > Huh? Nevermind ->write(), what about open()?
>
> What about open?
>
> At open time, file->f_cred is the same as current_cred().

int cachefiles_write_page(struct fscache_storage *op, struct page *page)
{
...
file = dentry_open(&path, O_RDWR | O_LARGEFILE, cache->cache_cred);


int ecryptfs_privileged_open(struct file **lower_file,
struct dentry *lower_dentry,
struct vfsmount *lower_mnt,
const struct cred *cred)
...
(*lower_file) = dentry_open(&req.path, flags, cred);


/* Derived from fs/exec.c:flush_old_files. */
static inline void flush_unauthorized_files(const struct cred *cred,
struct files_struct *files)
...
devnull = dentry_open(&selinux_null, O_RDWR, cred);

(granted, here we don't care much, /dev/null being what it is)

In mainline:
struct file *filp_clone_open(struct file *oldfile)
{
...
retval = vfs_open(&oldfile->f_path, file, oldfile->f_cred);



2018-07-18 20:17:27

by Al Viro

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 09:04:11PM +0100, Al Viro wrote:
> On Wed, Jul 18, 2018 at 12:53:48PM -0700, Linus Torvalds wrote:
> > On Wed, Jul 18, 2018 at 12:46 PM Al Viro <[email protected]> wrote:
> > >
> > > Huh? Nevermind ->write(), what about open()?
> >
> > What about open?
> >
> > At open time, file->f_cred is the same as current_cred().
>
> int cachefiles_write_page(struct fscache_storage *op, struct page *page)
> {
> ...
> file = dentry_open(&path, O_RDWR | O_LARGEFILE, cache->cache_cred);
>
>
> int ecryptfs_privileged_open(struct file **lower_file,
> struct dentry *lower_dentry,
> struct vfsmount *lower_mnt,
> const struct cred *cred)
> ...
> (*lower_file) = dentry_open(&req.path, flags, cred);

Actually, scratch that one - in this case it is current_cred() (whether that's
the right value or not).

cachefile_write_page() case is for real, AFAICS.

2018-07-18 20:45:13

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 1:04 PM Al Viro <[email protected]> wrote:
>
>
> int cachefiles_write_page(struct fscache_storage *op, struct page *page)
> {
> ...
> file = dentry_open(&path, O_RDWR | O_LARGEFILE, cache->cache_cred);

Ugh. So on the one hand, in this case I'd actually be more ok with the
whole "call_with_creds()" model, because open() is actually *supposed*
to use creds.

At the same time, my stronger reaction is that "it's actually better
to just pass down the creds as a pointer like we do".

So I think the real problem is simply that people use the current
creds without thinking about it. Often it's just hidden in a
"capable(CAP_SYS_ADMIN)" call or something like that, where people
don't even _think_ about the whole thing.

What I really think I'd prefer is to have some simple way to "poison"
current_cred(). It could be something as simple as a per-thread
counter, and we'd have current_cred() do

WARN_ON_ONCE(in_interrupt() || current->cred_poison);

and then read/write/open could just inc/dec the cred_poison counter
(when the debug option is set).

We wouldn't even need to catch all the odd cases. We'd only need to
inc/dec the poison counter in the normal read/write/open cases, not
even worrying about the odd special cases (like that
cachefiles_write_page() case).

Why? Because then we'd make sure the filesystems do the right thing,
and then cachefiles_write_page() etc wouldn't even have to worry,
because all the _common_ open cases have already tested that yes, it
uses the right cred pointer and doesn't try to get whatever "current"
happens to be.

So I'd really like to just fix the cases where we access the wrong
creds. And I'd be more than happy to add a few wrapper functions to
make us _see_ the cases where we do so.

What I don't like is the idea of trying to make bad creds accesses
"magically do the right thing".

See what I'm aiming for?

Linus

2018-07-18 21:24:48

by Al Viro

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 01:43:00PM -0700, Linus Torvalds wrote:

> What I really think I'd prefer is to have some simple way to "poison"
> current_cred(). It could be something as simple as a per-thread
> counter, and we'd have current_cred() do
>
> WARN_ON_ONCE(in_interrupt() || current->cred_poison);

Why is counter any better than LSB of a pointer?

2018-07-18 21:29:42

by David Howells

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

Linus Torvalds <[email protected]> wrote:

> I explained earlier why it's wrong and fragile, and why it can just
> cause the *reverse* security problem if you do it wrong. So now you
> take a subtle bug, and make it even more subtle, and encourage people
> to do this known-broken model of using creds at IO time.

Are network filesystems allowed to use f_cred at I/O time to determine the
authentication/encryption parameters to commune with the server?

David

2018-07-18 21:29:55

by David Howells

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

Linus Torvalds <[email protected]> wrote:

> and then read/write/open could just inc/dec the cred_poison counter
> (when the debug option is set).

As I may have said, I have tried modifying the kernel to pass the cred pointer
down. The drivers and ioctl() implementations are/were particularly nasty in
this respect. So many of them were doing checks against the current thread,
not f_cred.

I think I need to work out some way to automate the process of adding in the
extra parameter as it's not something that I think can be trivially done with
coccinelle.

David

2018-07-18 23:07:49

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 2:22 PM Al Viro <[email protected]> wrote:
>
> Why is counter any better than LSB of a pointer?

Easier nesting, but if you do it with the "surround by a macro" model
I guess you can just save/restore instead (like you did in
call_with_creds).

Linus

2018-07-18 23:15:51

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 2:28 PM David Howells <[email protected]> wrote:
>
> Are network filesystems allowed to use f_cred at I/O time to determine the
> authentication/encryption parameters to commune with the server?

Absolutely. file->f_cred is very much "what was my ID at open time".

Of course, you may well have reasons why you actually want to cache
the key itself (and hide it in private_data or similar rather than
look it up, but if looking it up by uid is ok, then file->f_cred is
ok.

And if you check permissions at IO time (again using file->f_cred),
that's ok from a kernel perspective, but it's not really
POSIX-compliant. But obviously a lot of netrwork filesystems aren't
posix-compliant anyway.

Linus

2018-07-18 23:17:58

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC] call_with_creds()

On Wed, Jul 18, 2018 at 2:27 PM David Howells <[email protected]> wrote:
>
> As I may have said, I have tried modifying the kernel to pass the cred pointer
> down.

It should always be there in the 'struct file *'.

Now, we may have some broken stuff that passes only inodes down, but
they probably really should be fixed.

> The drivers and ioctl() implementations are/were particularly nasty in
> this respect. So many of them were doing checks against the current thread,
> not f_cred.

So ioctl() may be ok, simply because at least you shouldn't be able to
fool suid programs to do ioctl's on untrusted file descriptors.

So using current_cred() is still technically very wrong, but it's
probably not a huge problem in practice.

Now, if there's some cachefs kind of "do ioctl at the behest of
somebody else", then *that* would be a problem. I'm hoping there
isn't.

Linus