2022-12-17 19:24:51

by Eric Van Hensbergen

[permalink] [raw]
Subject: [PATCH] Improve 9p performance for read operations

This patch series adds a number of features to improve read/write
performance in the 9p filesystem. Mostly it is focused on fixing
readahead caching to help utilize the recently increased MSIZE
limits, but there are also some fixes for writeback caches in the
presence of readahead and/or mmap operations.

Altogether, these show roughly 10x speed increases on simple
file transfer. Future patch sets will improve writeback cache
behavior and directory caching.

These patches are also available on github:
https://github.com/v9fs/linux/tree/ericvh/9p-next-121722

Tested against qemu, cpu, and diod with fsx, dbench, and
some simple benchmarks.

Signed-off-by: Eric Van Hensbergen <[email protected]>



2022-12-17 19:25:10

by Eric Van Hensbergen

[permalink] [raw]
Subject: [PATCH 4/6] Consolidate file operations and add readahead and writeback

We had 3 different sets of file operations across 2 different protocol
variants differentiated by cache which really only changed 3
functions. But the real problem is that certain file modes, mount
options, and other factors weren't being considered when we
decided whether or not to use caches.

This consolidates all the operations and switches
to conditionals within a common set to decide whether or not
to do different aspects of caching.

Signed-off-by: Eric Van Hensbergen <[email protected]>
---
fs/9p/v9fs.c | 30 ++++------
fs/9p/v9fs.h | 2 +
fs/9p/v9fs_vfs.h | 4 --
fs/9p/vfs_dir.c | 9 +++
fs/9p/vfs_file.c | 123 +++++++----------------------------------
fs/9p/vfs_inode.c | 31 ++++-------
fs/9p/vfs_inode_dotl.c | 19 ++++++-
7 files changed, 71 insertions(+), 147 deletions(-)

diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 61a51b90600d..a46bf9121f11 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -39,8 +39,6 @@ enum {
Opt_uname, Opt_remotename, Opt_cache, Opt_cachetag,
/* Options that take no arguments */
Opt_nodevmap,
- /* Cache options */
- Opt_cache_loose, Opt_fscache, Opt_mmap,
/* Access options */
Opt_access, Opt_posixacl,
/* Lock timeout option */
@@ -58,9 +56,6 @@ static const match_table_t tokens = {
{Opt_remotename, "aname=%s"},
{Opt_nodevmap, "nodevmap"},
{Opt_cache, "cache=%s"},
- {Opt_cache_loose, "loose"},
- {Opt_fscache, "fscache"},
- {Opt_mmap, "mmap"},
{Opt_cachetag, "cachetag=%s"},
{Opt_access, "access=%s"},
{Opt_posixacl, "posixacl"},
@@ -69,10 +64,12 @@ static const match_table_t tokens = {
};

static const char *const v9fs_cache_modes[nr__p9_cache_modes] = {
- [CACHE_NONE] = "none",
- [CACHE_MMAP] = "mmap",
- [CACHE_LOOSE] = "loose",
- [CACHE_FSCACHE] = "fscache",
+ [CACHE_NONE] = "none",
+ [CACHE_READAHEAD] = "readahead",
+ [CACHE_WRITEBACK] = "writeback",
+ [CACHE_MMAP] = "mmap",
+ [CACHE_LOOSE] = "loose",
+ [CACHE_FSCACHE] = "fscache",
};

/* Interpret mount options for cache mode */
@@ -89,6 +86,12 @@ static int get_cache_mode(char *s)
} else if (!strcmp(s, "mmap")) {
version = CACHE_MMAP;
p9_debug(P9_DEBUG_9P, "Cache mode: mmap\n");
+ } else if (!strcmp(s, "writeback")) {
+ version = CACHE_WRITEBACK;
+ p9_debug(P9_DEBUG_9P, "Cache mode: writeback\n");
+ } else if (!strcmp(s, "readahead")) {
+ version = CACHE_READAHEAD;
+ p9_debug(P9_DEBUG_9P, "Cache mode: readahead\n");
} else if (!strcmp(s, "none")) {
version = CACHE_NONE;
p9_debug(P9_DEBUG_9P, "Cache mode: none\n");
@@ -266,15 +269,6 @@ static int v9fs_parse_options(struct v9fs_session_info *v9ses, char *opts)
case Opt_nodevmap:
v9ses->nodev = 1;
break;
- case Opt_cache_loose:
- v9ses->cache = CACHE_LOOSE;
- break;
- case Opt_fscache:
- v9ses->cache = CACHE_FSCACHE;
- break;
- case Opt_mmap:
- v9ses->cache = CACHE_MMAP;
- break;
case Opt_cachetag:
#ifdef CONFIG_9P_FSCACHE
kfree(v9ses->cachetag);
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
index 6acabc2e7dc9..5813967ecdf0 100644
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -51,6 +51,8 @@ enum p9_session_flags {
enum p9_cache_modes {
CACHE_NONE,
CACHE_MMAP,
+ CACHE_READAHEAD,
+ CACHE_WRITEBACK,
CACHE_LOOSE,
CACHE_FSCACHE,
nr__p9_cache_modes
diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h
index bc417da7e9c1..cce87c9bdd8b 100644
--- a/fs/9p/v9fs_vfs.h
+++ b/fs/9p/v9fs_vfs.h
@@ -36,10 +36,6 @@ extern const struct file_operations v9fs_dir_operations;
extern const struct file_operations v9fs_dir_operations_dotl;
extern const struct dentry_operations v9fs_dentry_operations;
extern const struct dentry_operations v9fs_cached_dentry_operations;
-extern const struct file_operations v9fs_cached_file_operations;
-extern const struct file_operations v9fs_cached_file_operations_dotl;
-extern const struct file_operations v9fs_mmap_file_operations;
-extern const struct file_operations v9fs_mmap_file_operations_dotl;
extern struct kmem_cache *v9fs_inode_cache;

struct inode *v9fs_alloc_inode(struct super_block *sb);
diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index 1675a196c2ba..536769cdf7c8 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -214,6 +214,15 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
p9_debug(P9_DEBUG_VFS, "inode: %p filp: %p fid: %d\n",
inode, filp, fid ? fid->fid : -1);
if (fid) {
+ if ((fid->qid.type == P9_QTFILE) && (filp->f_mode & FMODE_WRITE)) {
+ int retval = file_write_and_wait_range(filp, 0, -1);
+
+ if (retval != 0) {
+ p9_debug(P9_DEBUG_ERROR,
+ "trying to flush filp %p failed with error code %d\n",
+ filp, retval);
+ }
+ }
spin_lock(&inode->i_lock);
hlist_del(&fid->ilist);
spin_unlock(&inode->i_lock);
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 3b6458846a0b..64158664dcb4 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -73,7 +73,7 @@ int v9fs_file_open(struct inode *inode, struct file *file)
}

mutex_lock(&v9inode->v_mutex);
- if ((v9ses->cache) && !v9inode->writeback_fid &&
+ if ((v9ses->cache >= CACHE_WRITEBACK) && !v9inode->writeback_fid &&
((file->f_flags & O_ACCMODE) != O_RDONLY)) {
/*
* clone a fid and add it to writeback_fid
@@ -367,10 +367,15 @@ v9fs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
struct p9_fid *fid = iocb->ki_filp->private_data;
int ret, err = 0;
+ struct inode *inode = file_inode(iocb->ki_filp);
+ struct v9fs_session_info *v9ses = v9fs_inode2v9ses(inode);

p9_debug(P9_DEBUG_VFS, "count %zu offset %lld\n",
iov_iter_count(to), iocb->ki_pos);

+ if (v9ses->cache > CACHE_MMAP)
+ return generic_file_read_iter(iocb, to);
+
if (iocb->ki_filp->f_flags & O_NONBLOCK)
ret = p9_client_read_once(fid, iocb->ki_pos, to, &err);
else
@@ -395,6 +400,11 @@ v9fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
ssize_t retval;
loff_t origin;
int err = 0;
+ struct inode *inode = file_inode(iocb->ki_filp);
+ struct v9fs_session_info *v9ses = v9fs_inode2v9ses(inode);
+
+ if (v9ses->cache >= CACHE_WRITEBACK)
+ return generic_file_write_iter(iocb, from);

retval = generic_write_checks(iocb, from);
if (retval <= 0)
@@ -477,25 +487,16 @@ static int
v9fs_file_mmap(struct file *filp, struct vm_area_struct *vma)
{
int retval;
-
-
- retval = generic_file_mmap(filp, vma);
- if (!retval)
- vma->vm_ops = &v9fs_file_vm_ops;
-
- return retval;
-}
-
-static int
-v9fs_mmap_file_mmap(struct file *filp, struct vm_area_struct *vma)
-{
- int retval;
- struct inode *inode;
- struct v9fs_inode *v9inode;
+ struct inode *inode = file_inode(filp);
+ struct v9fs_inode *v9inode = V9FS_I(inode);
+ struct v9fs_session_info *v9ses = v9fs_inode2v9ses(inode);
struct p9_fid *fid;

- inode = file_inode(filp);
- v9inode = V9FS_I(inode);
+ if (v9ses->cache < CACHE_MMAP) {
+ invalidate_inode_pages2(filp->f_mapping);
+ return generic_file_readonly_mmap(filp, vma);
+ }
+
mutex_lock(&v9inode->v_mutex);
if (!v9inode->writeback_fid &&
(vma->vm_flags & VM_SHARED) &&
@@ -563,35 +564,6 @@ v9fs_vm_page_mkwrite(struct vm_fault *vmf)
return VM_FAULT_NOPAGE;
}

-/**
- * v9fs_mmap_file_read_iter - read from a file
- * @iocb: The operation parameters
- * @to: The buffer to read into
- *
- */
-static ssize_t
-v9fs_mmap_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
-{
- /* TODO: Check if there are dirty pages */
- return v9fs_file_read_iter(iocb, to);
-}
-
-/**
- * v9fs_mmap_file_write_iter - write to a file
- * @iocb: The operation parameters
- * @from: The data to write
- *
- */
-static ssize_t
-v9fs_mmap_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
-{
- /*
- * TODO: invalidate mmaps on filp's inode between
- * offset and offset+count
- */
- return v9fs_file_write_iter(iocb, from);
-}
-
static void v9fs_mmap_vm_close(struct vm_area_struct *vma)
{
struct inode *inode;
@@ -628,34 +600,6 @@ static const struct vm_operations_struct v9fs_mmap_file_vm_ops = {
.page_mkwrite = v9fs_vm_page_mkwrite,
};

-
-const struct file_operations v9fs_cached_file_operations = {
- .llseek = generic_file_llseek,
- .read_iter = generic_file_read_iter,
- .write_iter = generic_file_write_iter,
- .open = v9fs_file_open,
- .release = v9fs_dir_release,
- .lock = v9fs_file_lock,
- .mmap = v9fs_file_mmap,
- .splice_read = generic_file_splice_read,
- .splice_write = iter_file_splice_write,
- .fsync = v9fs_file_fsync,
-};
-
-const struct file_operations v9fs_cached_file_operations_dotl = {
- .llseek = generic_file_llseek,
- .read_iter = generic_file_read_iter,
- .write_iter = generic_file_write_iter,
- .open = v9fs_file_open,
- .release = v9fs_dir_release,
- .lock = v9fs_file_lock_dotl,
- .flock = v9fs_file_flock_dotl,
- .mmap = v9fs_file_mmap,
- .splice_read = generic_file_splice_read,
- .splice_write = iter_file_splice_write,
- .fsync = v9fs_file_fsync_dotl,
-};
-
const struct file_operations v9fs_file_operations = {
.llseek = generic_file_llseek,
.read_iter = v9fs_file_read_iter,
@@ -677,34 +621,7 @@ const struct file_operations v9fs_file_operations_dotl = {
.release = v9fs_dir_release,
.lock = v9fs_file_lock_dotl,
.flock = v9fs_file_flock_dotl,
- .mmap = generic_file_readonly_mmap,
- .splice_read = generic_file_splice_read,
- .splice_write = iter_file_splice_write,
- .fsync = v9fs_file_fsync_dotl,
-};
-
-const struct file_operations v9fs_mmap_file_operations = {
- .llseek = generic_file_llseek,
- .read_iter = v9fs_mmap_file_read_iter,
- .write_iter = v9fs_mmap_file_write_iter,
- .open = v9fs_file_open,
- .release = v9fs_dir_release,
- .lock = v9fs_file_lock,
- .mmap = v9fs_mmap_file_mmap,
- .splice_read = generic_file_splice_read,
- .splice_write = iter_file_splice_write,
- .fsync = v9fs_file_fsync,
-};
-
-const struct file_operations v9fs_mmap_file_operations_dotl = {
- .llseek = generic_file_llseek,
- .read_iter = v9fs_mmap_file_read_iter,
- .write_iter = v9fs_mmap_file_write_iter,
- .open = v9fs_file_open,
- .release = v9fs_dir_release,
- .lock = v9fs_file_lock_dotl,
- .flock = v9fs_file_flock_dotl,
- .mmap = v9fs_mmap_file_mmap,
+ .mmap = v9fs_file_mmap,
.splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.fsync = v9fs_file_fsync_dotl,
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 33e521c60e2c..de99f9275a94 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -287,24 +287,10 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses,
case S_IFREG:
if (v9fs_proto_dotl(v9ses)) {
inode->i_op = &v9fs_file_inode_operations_dotl;
- if (v9ses->cache == CACHE_LOOSE ||
- v9ses->cache == CACHE_FSCACHE)
- inode->i_fop =
- &v9fs_cached_file_operations_dotl;
- else if (v9ses->cache == CACHE_MMAP)
- inode->i_fop = &v9fs_mmap_file_operations_dotl;
- else
- inode->i_fop = &v9fs_file_operations_dotl;
+ inode->i_fop = &v9fs_file_operations_dotl;
} else {
inode->i_op = &v9fs_file_inode_operations;
- if (v9ses->cache == CACHE_LOOSE ||
- v9ses->cache == CACHE_FSCACHE)
- inode->i_fop =
- &v9fs_cached_file_operations;
- else if (v9ses->cache == CACHE_MMAP)
- inode->i_fop = &v9fs_mmap_file_operations;
- else
- inode->i_fop = &v9fs_file_operations;
+ inode->i_fop = &v9fs_file_operations;
}

break;
@@ -843,7 +829,7 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
inode = d_inode(dentry);
v9inode = V9FS_I(inode);
mutex_lock(&v9inode->v_mutex);
- if ((v9ses->cache) && !v9inode->writeback_fid &&
+ if ((v9ses->cache >= CACHE_WRITEBACK) && !v9inode->writeback_fid &&
((flags & O_ACCMODE) != O_RDONLY)) {
/*
* clone a fid and add it to writeback_fid
@@ -1030,6 +1016,7 @@ v9fs_vfs_getattr(struct user_namespace *mnt_userns, const struct path *path,
struct kstat *stat, u32 request_mask, unsigned int flags)
{
struct dentry *dentry = path->dentry;
+ struct inode *inode = d_inode(dentry);
struct v9fs_session_info *v9ses;
struct p9_fid *fid;
struct p9_wstat *st;
@@ -1039,6 +1026,9 @@ v9fs_vfs_getattr(struct user_namespace *mnt_userns, const struct path *path,
if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) {
generic_fillattr(&init_user_ns, d_inode(dentry), stat);
return 0;
+ } else if (v9ses->cache >= CACHE_WRITEBACK) {
+ if (S_ISREG(inode->i_mode))
+ filemap_write_and_wait(inode->i_mapping);
}
fid = v9fs_fid_lookup(dentry);
if (IS_ERR(fid))
@@ -1127,9 +1117,12 @@ static int v9fs_vfs_setattr(struct user_namespace *mnt_userns,
return retval;

if ((iattr->ia_valid & ATTR_SIZE) &&
- iattr->ia_size != i_size_read(inode)) {
+ iattr->ia_size != i_size_read(inode)) {
truncate_setsize(inode, iattr->ia_size);
- fscache_resize_cookie(v9fs_inode_cookie(v9inode), iattr->ia_size);
+ if (v9ses->cache == CACHE_FSCACHE)
+ fscache_resize_cookie(v9fs_inode_cookie(v9inode), iattr->ia_size);
+ else
+ invalidate_mapping_pages(&inode->i_data, 0, -1);
}

v9fs_invalidate_inode_attr(inode);
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 9fde73ffadaa..708cd728cc70 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -458,6 +458,7 @@ v9fs_vfs_getattr_dotl(struct user_namespace *mnt_userns,
struct dentry *dentry = path->dentry;
struct v9fs_session_info *v9ses;
struct p9_fid *fid;
+ struct inode *inode = d_inode(dentry);
struct p9_stat_dotl *st;

p9_debug(P9_DEBUG_VFS, "dentry: %p\n", dentry);
@@ -465,6 +466,9 @@ v9fs_vfs_getattr_dotl(struct user_namespace *mnt_userns,
if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE) {
generic_fillattr(&init_user_ns, d_inode(dentry), stat);
return 0;
+ } else if (v9ses->cache >= CACHE_WRITEBACK) {
+ if (S_ISREG(inode->i_mode))
+ filemap_write_and_wait(inode->i_mapping);
}
fid = v9fs_fid_lookup(dentry);
if (IS_ERR(fid))
@@ -540,12 +544,14 @@ int v9fs_vfs_setattr_dotl(struct user_namespace *mnt_userns,
struct dentry *dentry, struct iattr *iattr)
{
int retval, use_dentry = 0;
+ struct inode *inode = d_inode(dentry);
+ struct v9fs_inode *v9inode = V9FS_I(inode);
+ struct v9fs_session_info *v9ses;
struct p9_fid *fid = NULL;
struct p9_iattr_dotl p9attr = {
.uid = INVALID_UID,
.gid = INVALID_GID,
};
- struct inode *inode = d_inode(dentry);

p9_debug(P9_DEBUG_VFS, "\n");

@@ -553,6 +559,8 @@ int v9fs_vfs_setattr_dotl(struct user_namespace *mnt_userns,
if (retval)
return retval;

+ v9ses = v9fs_dentry2v9ses(dentry);
+
p9attr.valid = v9fs_mapped_iattr_valid(iattr->ia_valid);
if (iattr->ia_valid & ATTR_MODE)
p9attr.mode = iattr->ia_mode;
@@ -593,9 +601,14 @@ int v9fs_vfs_setattr_dotl(struct user_namespace *mnt_userns,
return retval;
}

- if ((iattr->ia_valid & ATTR_SIZE) &&
- iattr->ia_size != i_size_read(inode))
+ if ((iattr->ia_valid & ATTR_SIZE) && iattr->ia_size !=
+ i_size_read(inode)) {
truncate_setsize(inode, iattr->ia_size);
+ if (v9ses->cache == CACHE_FSCACHE)
+ fscache_resize_cookie(v9fs_inode_cookie(v9inode), iattr->ia_size);
+ else
+ invalidate_mapping_pages(&inode->i_data, 0, -1);
+ }

v9fs_invalidate_inode_attr(inode);
setattr_copy(&init_user_ns, inode, iattr);
--
2.37.2

2022-12-17 19:40:26

by Eric Van Hensbergen

[permalink] [raw]
Subject: [PATCH 2/6] Don't assume UID 0 attach

The writeback_fid fallback code assumes a root uid fallback which
breaks many server configurations (which don't run as root). This
patch switches to generic lookup which will follow argument
guidence on access modes and default ids to use on failure.

There is a deeper underlying problem with writeback_fids in that
this fallback is too standard and not an exception due to the way
writeback mode works in the current implementation. Subsequent
patches will try to associate writeback fids from the original user
either by flushing on close or by holding onto fid until writeback
completes.

Signed-off-by: Eric Van Hensbergen <[email protected]>
---
fs/9p/fid.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 805151114e96..1fbd12d581bb 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -304,7 +304,9 @@ struct p9_fid *v9fs_writeback_fid(struct dentry *dentry)
int err;
struct p9_fid *fid, *ofid;

- ofid = v9fs_fid_lookup_with_uid(dentry, GLOBAL_ROOT_UID, 0);
+ /* pull default uid from dfltuid */
+
+ ofid = v9fs_fid_lookup(dentry);
fid = clone_fid(ofid);
if (IS_ERR(fid))
goto error_out;
--
2.37.2

2022-12-17 20:59:59

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 4/6] Consolidate file operations and add readahead and writeback

Hi Eric,

I love your patch! Perhaps something to improve:

[auto build test WARNING on v6.1]
[also build test WARNING on linus/master next-20221216]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Eric-Van-Hensbergen/Adjust-maximum-MSIZE-to-account-for-p9-header/20221218-035253
patch link: https://lore.kernel.org/r/20221217185210.1431478-5-evanhensbergen%40icloud.com
patch subject: [PATCH 4/6] Consolidate file operations and add readahead and writeback
config: i386-defconfig
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/3cb1fb650908432b5df0279ca61f483e09ce2478
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Eric-Van-Hensbergen/Adjust-maximum-MSIZE-to-account-for-p9-header/20221218-035253
git checkout 3cb1fb650908432b5df0279ca61f483e09ce2478
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 olddefconfig
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash fs/9p/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> fs/9p/vfs_file.c:591:42: warning: 'v9fs_file_vm_ops' defined but not used [-Wunused-const-variable=]
591 | static const struct vm_operations_struct v9fs_file_vm_ops = {
| ^~~~~~~~~~~~~~~~


vim +/v9fs_file_vm_ops +591 fs/9p/vfs_file.c

fb89b45cdfdc8b Dominique Martinet 2014-01-10 589
fb89b45cdfdc8b Dominique Martinet 2014-01-10 590
7263cebed9fada Aneesh Kumar K.V 2011-02-28 @591 static const struct vm_operations_struct v9fs_file_vm_ops = {
7263cebed9fada Aneesh Kumar K.V 2011-02-28 592 .fault = filemap_fault,
f1820361f83d55 Kirill A. Shutemov 2014-04-07 593 .map_pages = filemap_map_pages,
7263cebed9fada Aneesh Kumar K.V 2011-02-28 594 .page_mkwrite = v9fs_vm_page_mkwrite,
7263cebed9fada Aneesh Kumar K.V 2011-02-28 595 };
7263cebed9fada Aneesh Kumar K.V 2011-02-28 596

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (2.40 kB)
config (136.49 kB)
Download all attachments

2022-12-17 23:13:05

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 4/6] Consolidate file operations and add readahead and writeback

Hi Eric,

I love your patch! Perhaps something to improve:

[auto build test WARNING on v6.1]
[also build test WARNING on linus/master next-20221216]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Eric-Van-Hensbergen/Adjust-maximum-MSIZE-to-account-for-p9-header/20221218-035253
patch link: https://lore.kernel.org/r/20221217185210.1431478-5-evanhensbergen%40icloud.com
patch subject: [PATCH 4/6] Consolidate file operations and add readahead and writeback
config: hexagon-randconfig-r041-20221218
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 98b13979fb05f3ed288a900deb843e7b27589e58)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/3cb1fb650908432b5df0279ca61f483e09ce2478
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Eric-Van-Hensbergen/Adjust-maximum-MSIZE-to-account-for-p9-header/20221218-035253
git checkout 3cb1fb650908432b5df0279ca61f483e09ce2478
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash fs/9p/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

In file included from fs/9p/vfs_file.c:16:
In file included from include/linux/inet.h:42:
In file included from include/net/net_namespace.h:43:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:12:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
^
In file included from fs/9p/vfs_file.c:16:
In file included from include/linux/inet.h:42:
In file included from include/net/net_namespace.h:43:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:12:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
^
In file included from fs/9p/vfs_file.c:16:
In file included from include/linux/inet.h:42:
In file included from include/net/net_namespace.h:43:
In file included from include/linux/skbuff.h:17:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:12:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
>> fs/9p/vfs_file.c:591:42: warning: unused variable 'v9fs_file_vm_ops' [-Wunused-const-variable]
static const struct vm_operations_struct v9fs_file_vm_ops = {
^
7 warnings generated.


vim +/v9fs_file_vm_ops +591 fs/9p/vfs_file.c

fb89b45cdfdc8b Dominique Martinet 2014-01-10 589
fb89b45cdfdc8b Dominique Martinet 2014-01-10 590
7263cebed9fada Aneesh Kumar K.V 2011-02-28 @591 static const struct vm_operations_struct v9fs_file_vm_ops = {
7263cebed9fada Aneesh Kumar K.V 2011-02-28 592 .fault = filemap_fault,
f1820361f83d55 Kirill A. Shutemov 2014-04-07 593 .map_pages = filemap_map_pages,
7263cebed9fada Aneesh Kumar K.V 2011-02-28 594 .page_mkwrite = v9fs_vm_page_mkwrite,
7263cebed9fada Aneesh Kumar K.V 2011-02-28 595 };
7263cebed9fada Aneesh Kumar K.V 2011-02-28 596

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (6.71 kB)
config (149.84 kB)
Download all attachments

2022-12-18 00:30:27

by Dominique Martinet

[permalink] [raw]
Subject: Re: [PATCH 2/6] Don't assume UID 0 attach

Eric Van Hensbergen wrote on Sat, Dec 17, 2022 at 06:52:06PM +0000:
> The writeback_fid fallback code assumes a root uid fallback which
> breaks many server configurations (which don't run as root). This
> patch switches to generic lookup which will follow argument
> guidence on access modes and default ids to use on failure.

Unfortunately this one will break writes to a file created as 400 I
think
That's the main reason we have this writeback fid afaik -- there are
cases where the user should be able to write to the file, but a plain
open/write won't work... I can't think of anything else than open 400
right now though

I'm sure there's an xfs_io command and xfstest for that, but for now:
python3 -c 'import os; f = os.open("testfile", os.O_CREAT + os.O_RDWR, 0o400); os.write(f, b"ok\n")'

iirc ganesha running as non-root just ignores root requests and opens as
current user-- this won't work for this particular case, but might be
good enough for you... With that said:

> There is a deeper underlying problem with writeback_fids in that
> this fallback is too standard and not an exception due to the way
> writeback mode works in the current implementation. Subsequent
> patches will try to associate writeback fids from the original user
> either by flushing on close or by holding onto fid until writeback
> completes.

If we can address this problem though I agree we should stop using
wrieback fids as much as we do.
Now fids are refcounted, I think we could just use the normal fid as
writeback fid (getting a ref), and the regular close will not clunk it
so delayed IOs will pass.

Worth a try?
--
Dominique

2022-12-18 01:38:01

by Eric Van Hensbergen

[permalink] [raw]
Subject: Re: [PATCH 2/6] Don't assume UID 0 attach



> On Dec 17, 2022, at 6:07 PM, [email protected] wrote:
>
> Eric Van Hensbergen wrote on Sat, Dec 17, 2022 at 06:52:06PM +0000:
>> The writeback_fid fallback code assumes a root uid fallback which
>> breaks many server configurations (which don't run as root). This
>> patch switches to generic lookup which will follow argument
>> guidence on access modes and default ids to use on failure.
>
> Unfortunately this one will break writes to a file created as 400 I
> think
> That's the main reason we have this writeback fid afaik -- there are
> cases where the user should be able to write to the file, but a plain
> open/write won't work... I can't think of anything else than open 400
> right now though
>

I’ll try and craft a test case for this, but I think it works?
That being said, I haven’t been trying the xfstests, just fsx and bench.

> I'm sure there's an xfs_io command and xfstest for that, but for now:
> python3 -c 'import os; f = os.open("testfile", os.O_CREAT + os.O_RDWR, 0o400); os.write(f, b"ok\n")'
>
> iirc ganesha running as non-root just ignores root requests and opens as
> current user-- this won't work for this particular case, but might be
> good enough for you... With that said:

Yeah, the real problem I ran into this was if the server is running as non-root this causes issues and I was testing against cpu (which doesn’t run as root). I need to go back and check, but if you are running as root and dftuid=0 then the behavior should be the same as before?
In any case, I’ll try to go back and make this work — my big issue was always using uid 0 regardless of what mount options said is Wong.

>
>> There is a deeper underlying problem with writeback_fids in that
>> this fallback is too standard and not an exception due to the way
>> writeback mode works in the current implementation. Subsequent
>> patches will try to associate writeback fids from the original user
>> either by flushing on close or by holding onto fid until writeback
>> completes.
>
> If we can address this problem though I agree we should stop using
> wrieback fids as much as we do.
> Now fids are refcounted, I think we could just use the normal fid as
> writeback fid (getting a ref), and the regular close will not clunk it
> so delayed IOs will pass.
>
> Worth a try?

Yeah, that (using regular fids) is exactly what I am doing in my write back-fix patch which isn’t part of this series. I was still hunting a few bugs, but I think I nailed them today. I have to do a more extensive test sweep of the different configs, but unit tests seem good to go now so if I end up reworking the patch set to address your comment above, I may just go ahead and add it to the resubmit set. However, I also go ahead and flush on close/clunk — and that gets rid of the delayed write back which I think is actually preferable anyways. I may re-introduce it with temporal caching, but its just so problematic…..

-Eric