As requested by Jan this is a 4-part series.
I diffed this against fs-next + my inode hash patch v3 as it adds one
i_state = 0 case. Should that hash thing not be accepted this bit is
trivially droppable from the patch.
Mateusz Guzik (4):
xfs: preserve i_state around inode_init_always in xfs_reinit_inode
vfs: partially sanitize i_state zeroing on inode creation
xfs: remove now spurious i_state initialization in xfs_inode_alloc
bcachefs: remove now spurious i_state initialization
fs/bcachefs/fs.c | 1 -
fs/inode.c | 13 +++----------
fs/xfs/xfs_icache.c | 5 +++--
3 files changed, 6 insertions(+), 13 deletions(-)
--
2.43.0
This is in preparation for the routine starting to zero the field.
De facto coded by Dave Chinner, see:
https://lore.kernel.org/linux-fsdevel/[email protected]/
Signed-off-by: Mateusz Guzik <[email protected]>
---
fs/xfs/xfs_icache.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 0953163a2d84..d31a2c1ac00a 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -314,6 +314,7 @@ xfs_reinit_inode(
dev_t dev = inode->i_rdev;
kuid_t uid = inode->i_uid;
kgid_t gid = inode->i_gid;
+ unsigned long state = inode->i_state;
error = inode_init_always(mp->m_super, inode);
@@ -324,6 +325,7 @@ xfs_reinit_inode(
inode->i_rdev = dev;
inode->i_uid = uid;
inode->i_gid = gid;
+ inode->i_state = state;
mapping_set_large_folios(inode->i_mapping);
return error;
}
--
2.43.0
inode_init_always started setting the field to 0.
Signed-off-by: Mateusz Guzik <[email protected]>
---
fs/xfs/xfs_icache.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index d31a2c1ac00a..088ac200b026 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -86,9 +86,8 @@ xfs_inode_alloc(
return NULL;
}
- /* VFS doesn't initialise i_mode or i_state! */
+ /* VFS doesn't initialise i_mode! */
VFS_I(ip)->i_mode = 0;
- VFS_I(ip)->i_state = 0;
mapping_set_large_folios(VFS_I(ip)->i_mapping);
XFS_STATS_INC(mp, vn_active);
--
2.43.0
inode_init_always started setting the field to 0.
Signed-off-by: Mateusz Guzik <[email protected]>
---
fs/bcachefs/fs.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/bcachefs/fs.c b/fs/bcachefs/fs.c
index 514bf83ebe29..f9044da417ac 100644
--- a/fs/bcachefs/fs.c
+++ b/fs/bcachefs/fs.c
@@ -230,7 +230,6 @@ static struct bch_inode_info *__bch2_new_inode(struct bch_fs *c)
two_state_lock_init(&inode->ei_pagecache_lock);
INIT_LIST_HEAD(&inode->ei_vfs_inode_list);
mutex_init(&inode->ei_quota_lock);
- inode->v.i_state = 0;
if (unlikely(inode_init_always(c->vfs_sb, &inode->v))) {
kmem_cache_free(bch2_inode_cache, inode);
--
2.43.0
new_inode used to have the following:
spin_lock(&inode_lock);
inodes_stat.nr_inodes++;
list_add(&inode->i_list, &inode_in_use);
list_add(&inode->i_sb_list, &sb->s_inodes);
inode->i_ino = ++last_ino;
inode->i_state = 0;
spin_unlock(&inode_lock);
over time things disappeared, got moved around or got replaced (global
inode lock with a per-inode lock), eventually this got reduced to:
spin_lock(&inode->i_lock);
inode->i_state = 0;
spin_unlock(&inode->i_lock);
But the lock acquire here does not synchronize against anyone.
Additionally iget5_locked performs i_state = 0 assignment without any
locks to begin with, the two combined look confusing at best.
It looks like the current state is a leftover which was not cleaned up.
Ideally it would be an invariant that i_state == 0 to begin with, but
achieving that would require dealing with all filesystem alloc handlers
one by one.
In the meantime drop the misleading locking and move i_state zeroing to
inode_init_always so that others don't need to deal with it by hand.
Signed-off-by: Mateusz Guzik <[email protected]>
---
fs/inode.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/fs/inode.c b/fs/inode.c
index 3a4c67bfe085..8f05d79de01d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -231,6 +231,8 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
if (unlikely(security_inode_alloc(inode)))
return -ENOMEM;
+
+ inode->i_state = 0;
this_cpu_inc(nr_inodes);
return 0;
@@ -1023,14 +1025,7 @@ EXPORT_SYMBOL(get_next_ino);
*/
struct inode *new_inode_pseudo(struct super_block *sb)
{
- struct inode *inode = alloc_inode(sb);
-
- if (inode) {
- spin_lock(&inode->i_lock);
- inode->i_state = 0;
- spin_unlock(&inode->i_lock);
- }
- return inode;
+ return alloc_inode(sb);
}
/**
@@ -1254,7 +1249,6 @@ struct inode *iget5_locked(struct super_block *sb, unsigned long hashval,
struct inode *new = alloc_inode(sb);
if (new) {
- new->i_state = 0;
inode = inode_insert5(new, hashval, test, set, data);
if (unlikely(inode != new))
destroy_inode(new);
@@ -1285,7 +1279,6 @@ struct inode *iget5_locked_rcu(struct super_block *sb, unsigned long hashval,
struct inode *new = alloc_inode(sb);
if (new) {
- new->i_state = 0;
inode = inode_insert5(new, hashval, test, set, data);
if (unlikely(inode != new))
destroy_inode(new);
--
2.43.0
On Tue, Jun 11, 2024 at 02:06:26PM GMT, Mateusz Guzik wrote:
> inode_init_always started setting the field to 0.
>
> Signed-off-by: Mateusz Guzik <[email protected]>
Acked-by: Kent Overstreet <[email protected]>
On Tue 11-06-24 14:06:24, Mateusz Guzik wrote:
> new_inode used to have the following:
> spin_lock(&inode_lock);
> inodes_stat.nr_inodes++;
> list_add(&inode->i_list, &inode_in_use);
> list_add(&inode->i_sb_list, &sb->s_inodes);
> inode->i_ino = ++last_ino;
> inode->i_state = 0;
> spin_unlock(&inode_lock);
>
> over time things disappeared, got moved around or got replaced (global
> inode lock with a per-inode lock), eventually this got reduced to:
> spin_lock(&inode->i_lock);
> inode->i_state = 0;
> spin_unlock(&inode->i_lock);
>
> But the lock acquire here does not synchronize against anyone.
>
> Additionally iget5_locked performs i_state = 0 assignment without any
> locks to begin with, the two combined look confusing at best.
>
> It looks like the current state is a leftover which was not cleaned up.
>
> Ideally it would be an invariant that i_state == 0 to begin with, but
> achieving that would require dealing with all filesystem alloc handlers
> one by one.
>
> In the meantime drop the misleading locking and move i_state zeroing to
> inode_init_always so that others don't need to deal with it by hand.
>
> Signed-off-by: Mateusz Guzik <[email protected]>
Just one nit below:
> diff --git a/fs/inode.c b/fs/inode.c
> index 3a4c67bfe085..8f05d79de01d 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -231,6 +231,8 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
>
> if (unlikely(security_inode_alloc(inode)))
> return -ENOMEM;
> +
> + inode->i_state = 0;
> this_cpu_inc(nr_inodes);
This would be more logical above where inode content is initialized (and
less errorprone just in case security_inode_alloc() grows dependency on
i_state value) - like just after:
inode->i_flags = 0;
With that fixed feel free to add:
Reviewed-by: Jan Kara <[email protected]>
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue, 11 Jun 2024 14:06:22 +0200, Mateusz Guzik wrote:
> As requested by Jan this is a 4-part series.
>
> I diffed this against fs-next + my inode hash patch v3 as it adds one
> i_state = 0 case. Should that hash thing not be accepted this bit is
> trivially droppable from the patch.
>
> Mateusz Guzik (4):
> xfs: preserve i_state around inode_init_always in xfs_reinit_inode
> vfs: partially sanitize i_state zeroing on inode creation
> xfs: remove now spurious i_state initialization in xfs_inode_alloc
> bcachefs: remove now spurious i_state initialization
>
> [...]
Applied to the vfs.inode.rcu branch of the vfs/vfs.git tree.
Patches in the vfs.inode.rcu branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.inode.rcu
[1/4] xfs: preserve i_state around inode_init_always in xfs_reinit_inode
https://git.kernel.org/vfs/vfs/c/f6f496712632
[2/4] vfs: partially sanitize i_state zeroing on inode creation
https://git.kernel.org/vfs/vfs/c/1fddfb5628e4
[3/4] xfs: remove now spurious i_state initialization in xfs_inode_alloc
https://git.kernel.org/vfs/vfs/c/c0a6bf1d02d8
[4/4] bcachefs: remove now spurious i_state initialization
https://git.kernel.org/vfs/vfs/c/9ed6c60e6053
> This would be more logical above where inode content is initialized (and
> less errorprone just in case security_inode_alloc() grows dependency on
> i_state value) - like just after:
>
> inode->i_flags = 0;
Fixed that in-tree. Thanks!