Hi all,
The chattr(1) manpage has this to say about the immutable bit that
system administrators can set on files:
"A file with the 'i' attribute cannot be modified: it cannot be deleted
or renamed, no link can be created to this file, most of the file's
metadata can not be modified, and the file can not be opened in write
mode."
Given the clause about how the file 'cannot be modified', it is
surprising that programs holding writable file descriptors can continue
to write to and truncate files after the immutable flag has been set,
but they cannot call other things such as utimes, fallocate, unlink,
link, setxattr, or reflink.
Since the immutable flag is only settable by administrators, resolve
this inconsistent behavior in favor of the documented behavior -- once
the flag is set, the file cannot be modified, period.
If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.
This is an extraordinary way to destroy everything. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=immutable-files
fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=immutable-files
From: Darrick J. Wong <[email protected]>
We passed an inode into xfs_ioctl_setattr_get_trans with join_flags
indicating which locks are held on that inode. If we can't allocate a
transaction then we need to unlock the inode before we bail out.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/xfs/xfs_ioctl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index ae615a79b266..21d6f433c375 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1153,7 +1153,7 @@ xfs_ioctl_setattr_get_trans(
error = xfs_trans_alloc(mp, &M_RES(mp)->tr_ichange, 0, 0, 0, &tp);
if (error)
- return ERR_PTR(error);
+ goto out_unlock;
xfs_ilock(ip, XFS_ILOCK_EXCL);
xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL | join_flags);
From: Darrick J. Wong <[email protected]>
The chattr manpage has this to say about immutable files:
"A file with the 'i' attribute cannot be modified: it cannot be deleted
or renamed, no link can be created to this file, most of the file's
metadata can not be modified, and the file can not be opened in write
mode."
This means that we need to flush the page cache when setting the
immutable flag so that all mappings will become read-only again and
therefore programs cannot continue to write to writable mappings.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/xfs/xfs_ioctl.c | 52 +++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 45 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 21d6f433c375..de35cf4469f6 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1009,6 +1009,31 @@ xfs_diflags_to_linux(
#endif
}
+/*
+ * Lock the inode against file io and page faults, then flush all dirty pages
+ * and wait for writeback and direct IO operations to finish. Returns with
+ * the relevant inode lock flags set in @join_flags. Caller is responsible for
+ * unlocking even on error return.
+ */
+static int
+xfs_ioctl_setattr_flush(
+ struct xfs_inode *ip,
+ int *join_flags)
+{
+ struct inode *inode = VFS_I(ip);
+
+ /* Already locked the inode from IO? Assume we're done. */
+ if (((*join_flags) & (XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL)) ==
+ (XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL))
+ return 0;
+
+ /* Lock and flush all mappings and IO in preparation for flag change */
+ *join_flags = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
+ xfs_ilock(ip, *join_flags);
+ inode_dio_wait(inode);
+ return filemap_write_and_wait(inode->i_mapping);
+}
+
static int
xfs_ioctl_setattr_xflags(
struct xfs_trans *tp,
@@ -1103,25 +1128,22 @@ xfs_ioctl_setattr_dax_invalidate(
if (!(fa->fsx_xflags & FS_XFLAG_DAX) && !IS_DAX(inode))
return 0;
- if (S_ISDIR(inode->i_mode))
+ if (!S_ISREG(inode->i_mode))
return 0;
/* lock, flush and invalidate mapping in preparation for flag change */
- xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
- error = filemap_write_and_wait(inode->i_mapping);
+ error = xfs_ioctl_setattr_flush(ip, join_flags);
if (error)
goto out_unlock;
error = invalidate_inode_pages2(inode->i_mapping);
if (error)
goto out_unlock;
-
- *join_flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL;
return 0;
out_unlock:
- xfs_iunlock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
+ xfs_iunlock(ip, *join_flags);
+ *join_flags = 0;
return error;
-
}
/*
@@ -1367,6 +1389,22 @@ xfs_ioctl_setattr(
if (code)
goto error_free_dquots;
+ /*
+ * Wait for all pending directio and then flush all the dirty pages
+ * for this file. The flush marks all the pages readonly, so any
+ * subsequent attempt to write to the file (particularly mmap pages)
+ * will come through the filesystem and fail.
+ */
+ if (S_ISREG(VFS_I(ip)->i_mode) && !IS_IMMUTABLE(VFS_I(ip)) &&
+ (fa->fsx_xflags & FS_XFLAG_IMMUTABLE)) {
+ code = xfs_ioctl_setattr_flush(ip, &join_flags);
+ if (code) {
+ xfs_iunlock(ip, join_flags);
+ join_flags = 0;
+ goto error_free_dquots;
+ }
+ }
+
tp = xfs_ioctl_setattr_get_trans(ip, join_flags);
if (IS_ERR(tp)) {
code = PTR_ERR(tp);
From: Darrick J. Wong <[email protected]>
Refactor the SETFLAGS implementation to use the SETXATTR code directly
instead of partially constructing a struct fsxattr and calling bits and
pieces of the setxattr code. This reduces code size and becomes
necessary in the next patch to maintain the behavior of allowing
userspace to set immutable on an immutable file so long as nothing
/else/ about the attributes change.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/xfs/xfs_ioctl.c | 76 ++++++++++++++++++++--------------------------------
1 file changed, 29 insertions(+), 47 deletions(-)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index de35cf4469f6..fdabf47532ed 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -882,38 +882,46 @@ xfs_di2lxflags(
return flags;
}
-STATIC int
-xfs_ioc_fsgetxattr(
- xfs_inode_t *ip,
- int attr,
- void __user *arg)
+static inline void
+xfs_inode_getfsxattr(
+ struct xfs_inode *ip,
+ bool attr,
+ struct fsxattr *fa)
{
- struct fsxattr fa;
-
- memset(&fa, 0, sizeof(struct fsxattr));
+ memset(fa, 0, sizeof(struct fsxattr));
xfs_ilock(ip, XFS_ILOCK_SHARED);
- fa.fsx_xflags = xfs_ip2xflags(ip);
- fa.fsx_extsize = ip->i_d.di_extsize << ip->i_mount->m_sb.sb_blocklog;
- fa.fsx_cowextsize = ip->i_d.di_cowextsize <<
- ip->i_mount->m_sb.sb_blocklog;
- fa.fsx_projid = xfs_get_projid(ip);
+ fa->fsx_xflags = xfs_ip2xflags(ip);
+ fa->fsx_extsize = XFS_FSB_TO_B(ip->i_mount, ip->i_d.di_extsize);
+ fa->fsx_cowextsize = XFS_FSB_TO_B(ip->i_mount, ip->i_d.di_cowextsize);
+ fa->fsx_projid = xfs_get_projid(ip);
if (attr) {
if (ip->i_afp) {
if (ip->i_afp->if_flags & XFS_IFEXTENTS)
- fa.fsx_nextents = xfs_iext_count(ip->i_afp);
+ fa->fsx_nextents = xfs_iext_count(ip->i_afp);
else
- fa.fsx_nextents = ip->i_d.di_anextents;
+ fa->fsx_nextents = ip->i_d.di_anextents;
} else
- fa.fsx_nextents = 0;
+ fa->fsx_nextents = 0;
} else {
if (ip->i_df.if_flags & XFS_IFEXTENTS)
- fa.fsx_nextents = xfs_iext_count(&ip->i_df);
+ fa->fsx_nextents = xfs_iext_count(&ip->i_df);
else
- fa.fsx_nextents = ip->i_d.di_nextents;
+ fa->fsx_nextents = ip->i_d.di_nextents;
}
xfs_iunlock(ip, XFS_ILOCK_SHARED);
+}
+
+STATIC int
+xfs_ioc_fsgetxattr(
+ xfs_inode_t *ip,
+ int attr,
+ void __user *arg)
+{
+ struct fsxattr fa;
+
+ xfs_inode_getfsxattr(ip, attr, &fa);
if (copy_to_user(arg, &fa, sizeof(fa)))
return -EFAULT;
@@ -1528,10 +1536,8 @@ xfs_ioc_setxflags(
struct file *filp,
void __user *arg)
{
- struct xfs_trans *tp;
struct fsxattr fa;
unsigned int flags;
- int join_flags = 0;
int error;
if (copy_from_user(&flags, arg, sizeof(flags)))
@@ -1542,37 +1548,13 @@ xfs_ioc_setxflags(
FS_SYNC_FL))
return -EOPNOTSUPP;
- fa.fsx_xflags = xfs_merge_ioc_xflags(flags, xfs_ip2xflags(ip));
+ xfs_inode_getfsxattr(ip, false, &fa);
+ fa.fsx_xflags = xfs_merge_ioc_xflags(flags, fa.fsx_xflags);
error = mnt_want_write_file(filp);
if (error)
return error;
-
- /*
- * Changing DAX config may require inode locking for mapping
- * invalidation. These need to be held all the way to transaction commit
- * or cancel time, so need to be passed through to
- * xfs_ioctl_setattr_get_trans() so it can apply them to the join call
- * appropriately.
- */
- error = xfs_ioctl_setattr_dax_invalidate(ip, &fa, &join_flags);
- if (error)
- goto out_drop_write;
-
- tp = xfs_ioctl_setattr_get_trans(ip, join_flags);
- if (IS_ERR(tp)) {
- error = PTR_ERR(tp);
- goto out_drop_write;
- }
-
- error = xfs_ioctl_setattr_xflags(tp, ip, &fa);
- if (error) {
- xfs_trans_cancel(tp);
- goto out_drop_write;
- }
-
- error = xfs_trans_commit(tp);
-out_drop_write:
+ error = xfs_ioctl_setattr(ip, &fa);
mnt_drop_write_file(filp);
return error;
}
From: Darrick J. Wong <[email protected]>
Clean up the calling convention since we're editing the fsxattr struct
anyway.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/xfs/xfs_ioctl.c | 32 ++++++++++++++------------------
1 file changed, 14 insertions(+), 18 deletions(-)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index fdabf47532ed..5862b7cead4c 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -832,35 +832,31 @@ xfs_ioc_ag_geometry(
* Linux extended inode flags interface.
*/
-STATIC unsigned int
+static inline void
xfs_merge_ioc_xflags(
- unsigned int flags,
- unsigned int start)
+ struct fsxattr *fa,
+ unsigned int flags)
{
- unsigned int xflags = start;
-
if (flags & FS_IMMUTABLE_FL)
- xflags |= FS_XFLAG_IMMUTABLE;
+ fa->fsx_xflags |= FS_XFLAG_IMMUTABLE;
else
- xflags &= ~FS_XFLAG_IMMUTABLE;
+ fa->fsx_xflags &= ~FS_XFLAG_IMMUTABLE;
if (flags & FS_APPEND_FL)
- xflags |= FS_XFLAG_APPEND;
+ fa->fsx_xflags |= FS_XFLAG_APPEND;
else
- xflags &= ~FS_XFLAG_APPEND;
+ fa->fsx_xflags &= ~FS_XFLAG_APPEND;
if (flags & FS_SYNC_FL)
- xflags |= FS_XFLAG_SYNC;
+ fa->fsx_xflags |= FS_XFLAG_SYNC;
else
- xflags &= ~FS_XFLAG_SYNC;
+ fa->fsx_xflags &= ~FS_XFLAG_SYNC;
if (flags & FS_NOATIME_FL)
- xflags |= FS_XFLAG_NOATIME;
+ fa->fsx_xflags |= FS_XFLAG_NOATIME;
else
- xflags &= ~FS_XFLAG_NOATIME;
+ fa->fsx_xflags &= ~FS_XFLAG_NOATIME;
if (flags & FS_NODUMP_FL)
- xflags |= FS_XFLAG_NODUMP;
+ fa->fsx_xflags |= FS_XFLAG_NODUMP;
else
- xflags &= ~FS_XFLAG_NODUMP;
-
- return xflags;
+ fa->fsx_xflags &= ~FS_XFLAG_NODUMP;
}
STATIC unsigned int
@@ -1549,7 +1545,7 @@ xfs_ioc_setxflags(
return -EOPNOTSUPP;
xfs_inode_getfsxattr(ip, false, &fa);
- fa.fsx_xflags = xfs_merge_ioc_xflags(flags, fa.fsx_xflags);
+ xfs_merge_ioc_xflags(&fa, flags);
error = mnt_want_write_file(filp);
if (error)
From: Darrick J. Wong <[email protected]>
The chattr manpage has this to say about immutable files:
"A file with the 'i' attribute cannot be modified: it cannot be deleted
or renamed, no link can be created to this file, most of the file's
metadata can not be modified, and the file can not be opened in write
mode."
However, we don't actually check the immutable flag in the setattr code,
which means that we can update project ids and extent size hints on
supposedly immutable files. Therefore, reject a setattr call on an
immutable file except for the case where we're trying to unset
IMMUTABLE.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/xfs/xfs_ioctl.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 45 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 5862b7cead4c..b5b50006e807 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1038,6 +1038,41 @@ xfs_ioctl_setattr_flush(
return filemap_write_and_wait(inode->i_mapping);
}
+/*
+ * If immutable is set and we are not clearing it, we're not allowed to change
+ * anything else in the inode. Don't error out if we're only trying to set
+ * immutable on an immutable file.
+ */
+static int
+xfs_ioctl_setattr_immutable(
+ struct xfs_inode *ip,
+ struct fsxattr *fa,
+ uint16_t di_flags,
+ uint64_t di_flags2)
+{
+ struct xfs_mount *mp = ip->i_mount;
+
+ if (!(ip->i_d.di_flags & XFS_DIFLAG_IMMUTABLE) ||
+ !(di_flags & XFS_DIFLAG_IMMUTABLE))
+ return 0;
+
+ if ((ip->i_d.di_flags & ~XFS_DIFLAG_IMMUTABLE) !=
+ (di_flags & ~XFS_DIFLAG_IMMUTABLE))
+ return -EPERM;
+ if (ip->i_d.di_version >= 3 && ip->i_d.di_flags2 != di_flags2)
+ return -EPERM;
+ if (xfs_get_projid(ip) != fa->fsx_projid)
+ return -EPERM;
+ if ((di_flags & (XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT)) &&
+ ip->i_d.di_extsize != fa->fsx_extsize >> mp->m_sb.sb_blocklog)
+ return -EPERM;
+ if (ip->i_d.di_version >= 3 && (di_flags2 & XFS_DIFLAG2_COWEXTSIZE) &&
+ ip->i_d.di_cowextsize != fa->fsx_cowextsize >> mp->m_sb.sb_blocklog)
+ return -EPERM;
+
+ return 0;
+}
+
static int
xfs_ioctl_setattr_xflags(
struct xfs_trans *tp,
@@ -1045,7 +1080,9 @@ xfs_ioctl_setattr_xflags(
struct fsxattr *fa)
{
struct xfs_mount *mp = ip->i_mount;
+ uint16_t di_flags;
uint64_t di_flags2;
+ int error;
/* Can't change realtime flag if any extents are allocated. */
if ((ip->i_d.di_nextents || ip->i_delayed_blks) &&
@@ -1076,12 +1113,18 @@ xfs_ioctl_setattr_xflags(
!capable(CAP_LINUX_IMMUTABLE))
return -EPERM;
- /* diflags2 only valid for v3 inodes. */
+ /* Don't allow changes to an immutable inode. */
+ di_flags = xfs_flags2diflags(ip, fa->fsx_xflags);
di_flags2 = xfs_flags2diflags2(ip, fa->fsx_xflags);
+ error = xfs_ioctl_setattr_immutable(ip, fa, di_flags, di_flags2);
+ if (error)
+ return error;
+
+ /* diflags2 only valid for v3 inodes. */
if (di_flags2 && ip->i_d.di_version < 3)
return -EINVAL;
- ip->i_d.di_flags = xfs_flags2diflags(ip, fa->fsx_xflags);
+ ip->i_d.di_flags = di_flags;
ip->i_d.di_flags2 = di_flags2;
xfs_diflags_to_linux(ip);
From: Darrick J. Wong <[email protected]>
Don't allow any modifications to a file that's marked immutable, which
means that we have to flush all the writable pages to make the readonly
and we have to check the setattr/setflags parameters more closely.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/btrfs/ioctl.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index cd4e693406a0..632600e4be0a 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -180,6 +180,44 @@ static int check_fsflags(unsigned int flags)
return 0;
}
+/* Do all the prep work to check immutable status and all that. */
+static int btrfs_ioctl_check_immutable(struct inode *inode,
+ unsigned int fsflags,
+ const unsigned int immutable_fsflag)
+{
+ struct btrfs_inode *binode = BTRFS_I(inode);
+ int ret;
+
+ /*
+ * Wait for all pending directio and then flush all the dirty pages
+ * for this file. The flush marks all the pages readonly, so any
+ * subsequent attempt to write to the file (particularly mmap pages)
+ * will come through the filesystem and fail.
+ */
+ if (S_ISREG(inode->i_mode) && !IS_IMMUTABLE(inode) &&
+ (fsflags & immutable_fsflag)) {
+ inode_dio_wait(inode);
+ ret = filemap_write_and_wait(inode->i_mapping);
+ if (ret)
+ return ret;
+ }
+
+ /*
+ * If immutable is set and we are not clearing it, we're not allowed to
+ * change anything else in the inode. Don't error out if we're only
+ * trying to set immutable on an immutable file.
+ */
+ if (!(binode->flags & BTRFS_INODE_IMMUTABLE) ||
+ !(fsflags & immutable_fsflag))
+ return 0;
+
+ if ((binode->flags & ~BTRFS_INODE_IMMUTABLE) !=
+ (fsflags & ~immutable_fsflag))
+ return -EPERM;
+
+ return 0;
+}
+
static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
{
struct inode *inode = file_inode(file);
@@ -225,6 +263,10 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
}
}
+ ret = btrfs_ioctl_check_immutable(inode, fsflags, FS_IMMUTABLE_FL);
+ if (ret)
+ goto out_unlock;
+
if (fsflags & FS_SYNC_FL)
binode->flags |= BTRFS_INODE_SYNC;
else
@@ -433,6 +475,11 @@ static int btrfs_ioctl_fssetxattr(struct file *file, void __user *arg)
goto out_unlock;
}
+ ret = btrfs_ioctl_check_immutable(inode, fa.fsx_xflags,
+ FS_XFLAG_IMMUTABLE);
+ if (ret)
+ goto out_unlock;
+
if (fa.fsx_xflags & FS_XFLAG_SYNC)
binode->flags |= BTRFS_INODE_SYNC;
else
From: Darrick J. Wong <[email protected]>
Don't allow any modifications to a file that's marked immutable, which
means that we have to flush all the writable pages to make the readonly
and we have to check the setattr/setflags parameters more closely.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/ext4/ioctl.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 45 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index bab3da4f1e0d..abf3b88d5af7 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -269,6 +269,29 @@ static int uuid_is_zero(__u8 u[16])
}
#endif
+/*
+ * If immutable is set and we are not clearing it, we're not allowed to change
+ * anything else in the inode. Don't error out if we're only trying to set
+ * immutable on an immutable file.
+ */
+static int ext4_ioctl_check_immutable(struct inode *inode, __u32 new_projid,
+ unsigned int flags)
+{
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ unsigned int oldflags = ei->i_flags;
+
+ if (!(oldflags & EXT4_IMMUTABLE_FL) || !(flags & EXT4_IMMUTABLE_FL))
+ return 0;
+
+ if ((oldflags & ~EXT4_IMMUTABLE_FL) != (flags & ~EXT4_IMMUTABLE_FL))
+ return -EPERM;
+ if (ext4_has_feature_project(inode->i_sb) &&
+ __kprojid_val(ei->i_projid) != new_projid)
+ return -EPERM;
+
+ return 0;
+}
+
static int ext4_ioctl_setflags(struct inode *inode,
unsigned int flags)
{
@@ -322,6 +345,20 @@ static int ext4_ioctl_setflags(struct inode *inode,
goto flags_out;
}
+ /*
+ * Wait for all pending directio and then flush all the dirty pages
+ * for this file. The flush marks all the pages readonly, so any
+ * subsequent attempt to write to the file (particularly mmap pages)
+ * will come through the filesystem and fail.
+ */
+ if (S_ISREG(inode->i_mode) && !IS_IMMUTABLE(inode) &&
+ (flags & EXT4_IMMUTABLE_FL)) {
+ inode_dio_wait(inode);
+ err = filemap_write_and_wait(inode->i_mapping);
+ if (err)
+ goto flags_out;
+ }
+
handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
if (IS_ERR(handle)) {
err = PTR_ERR(handle);
@@ -751,7 +788,11 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
return err;
inode_lock(inode);
- err = ext4_ioctl_setflags(inode, flags);
+ err = ext4_ioctl_check_immutable(inode,
+ from_kprojid(&init_user_ns, ei->i_projid),
+ flags);
+ if (!err)
+ err = ext4_ioctl_setflags(inode, flags);
inode_unlock(inode);
mnt_drop_write_file(filp);
return err;
@@ -1121,6 +1162,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
goto out;
flags = (ei->i_flags & ~EXT4_FL_XFLAG_VISIBLE) |
(flags & EXT4_FL_XFLAG_VISIBLE);
+ err = ext4_ioctl_check_immutable(inode, fa.fsx_projid, flags);
+ if (err)
+ goto out;
err = ext4_ioctl_setflags(inode, flags);
if (err)
goto out;
From: Darrick J. Wong <[email protected]>
The chattr manpage has this to say about immutable files:
"A file with the 'i' attribute cannot be modified: it cannot be deleted
or renamed, no link can be created to this file, most of the file's
metadata can not be modified, and the file can not be opened in write
mode."
Once the flag is set, it is enforced for quite a few file operations,
such as fallocate, fpunch, fzero, rm, touch, open, etc. However, we
don't check for immutability when doing a write(), a PROT_WRITE mmap(),
a truncate(), or a write to a previously established mmap.
If a program has an open write fd to a file that the administrator
subsequently marks immutable, the program still can change the file
contents. Weird!
The ability to write to an immutable file does not follow the manpage
promise that immutable files cannot be modified. Worse yet it's
inconsistent with the behavior of other syscalls which don't allow
modifications of immutable files.
Therefore, add the necessary checks to make the write, mmap, and
truncate behavior consistent with what the manpage says and consistent
with other syscalls on filesystems which support IMMUTABLE.
Signed-off-by: Darrick J. Wong <[email protected]>
---
fs/attr.c | 13 ++++++-------
mm/filemap.c | 3 +++
mm/memory.c | 3 +++
mm/mmap.c | 8 ++++++--
4 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c
index d22e8187477f..1fcfdcc5b367 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -233,19 +233,18 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
WARN_ON_ONCE(!inode_is_locked(inode));
- if (ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) {
- if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
- return -EPERM;
- }
+ if (IS_IMMUTABLE(inode))
+ return -EPERM;
+
+ if ((ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) &&
+ IS_APPEND(inode))
+ return -EPERM;
/*
* If utimes(2) and friends are called with times == NULL (or both
* times are UTIME_NOW), then we need to check for write permission
*/
if (ia_valid & ATTR_TOUCH) {
- if (IS_IMMUTABLE(inode))
- return -EPERM;
-
if (!inode_owner_or_capable(inode)) {
error = inode_permission(inode, MAY_WRITE);
if (error)
diff --git a/mm/filemap.c b/mm/filemap.c
index d78f577baef2..9fed698f4c63 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3033,6 +3033,9 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from)
loff_t count;
int ret;
+ if (IS_IMMUTABLE(inode))
+ return -EPERM;
+
if (!iov_iter_count(from))
return 0;
diff --git a/mm/memory.c b/mm/memory.c
index ab650c21bccd..dfd5eba278d6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
+ if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
+ return VM_FAULT_SIGBUS;
+
ret = vmf->vma->vm_ops->page_mkwrite(vmf);
/* Restore original flags so that caller is not surprised */
vmf->flags = old_flags;
diff --git a/mm/mmap.c b/mm/mmap.c
index 41eb48d9b527..697a101bda59 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1481,8 +1481,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
case MAP_SHARED_VALIDATE:
if (flags & ~flags_mask)
return -EOPNOTSUPP;
- if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
- return -EACCES;
+ if (prot & PROT_WRITE) {
+ if (!(file->f_mode & FMODE_WRITE))
+ return -EACCES;
+ if (IS_IMMUTABLE(file_inode(file)))
+ return -EPERM;
+ }
/*
* Make sure we don't allow writing to an append-only
On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <[email protected]>
>
> The chattr manpage has this to say about immutable files:
>
> "A file with the 'i' attribute cannot be modified: it cannot be deleted
> or renamed, no link can be created to this file, most of the file's
> metadata can not be modified, and the file can not be opened in write
> mode."
>
> Once the flag is set, it is enforced for quite a few file operations,
> such as fallocate, fpunch, fzero, rm, touch, open, etc. However, we
> don't check for immutability when doing a write(), a PROT_WRITE mmap(),
> a truncate(), or a write to a previously established mmap.
>
> If a program has an open write fd to a file that the administrator
> subsequently marks immutable, the program still can change the file
> contents. Weird!
>
> The ability to write to an immutable file does not follow the manpage
> promise that immutable files cannot be modified. Worse yet it's
> inconsistent with the behavior of other syscalls which don't allow
> modifications of immutable files.
>
> Therefore, add the necessary checks to make the write, mmap, and
> truncate behavior consistent with what the manpage says and consistent
> with other syscalls on filesystems which support IMMUTABLE.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
This mostly seems reasonable to me. I assume you'll want some mm acks. A
couple notes..
> fs/attr.c | 13 ++++++-------
> mm/filemap.c | 3 +++
> mm/memory.c | 3 +++
> mm/mmap.c | 8 ++++++--
> 4 files changed, 18 insertions(+), 9 deletions(-)
>
>
> diff --git a/fs/attr.c b/fs/attr.c
> index d22e8187477f..1fcfdcc5b367 100644
> --- a/fs/attr.c
> +++ b/fs/attr.c
> @@ -233,19 +233,18 @@ int notify_change(struct dentry * dentry, struct iattr * attr, struct inode **de
>
> WARN_ON_ONCE(!inode_is_locked(inode));
>
> - if (ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) {
> - if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
> - return -EPERM;
> - }
> + if (IS_IMMUTABLE(inode))
> + return -EPERM;
> +
> + if ((ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID | ATTR_TIMES_SET)) &&
> + IS_APPEND(inode))
> + return -EPERM;
>
> /*
> * If utimes(2) and friends are called with times == NULL (or both
> * times are UTIME_NOW), then we need to check for write permission
> */
> if (ia_valid & ATTR_TOUCH) {
> - if (IS_IMMUTABLE(inode))
> - return -EPERM;
> -
> if (!inode_owner_or_capable(inode)) {
> error = inode_permission(inode, MAY_WRITE);
> if (error)
> diff --git a/mm/filemap.c b/mm/filemap.c
> index d78f577baef2..9fed698f4c63 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3033,6 +3033,9 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from)
> loff_t count;
> int ret;
>
> + if (IS_IMMUTABLE(inode))
> + return -EPERM;
> +
> if (!iov_iter_count(from))
> return 0;
>
> diff --git a/mm/memory.c b/mm/memory.c
> index ab650c21bccd..dfd5eba278d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
>
> vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
>
> + if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
> + return VM_FAULT_SIGBUS;
> +
I take it this depends on cleaning already dirty pages when the
immutable bit is set. That appears to be done later in the series, but I
notice it occurs at the filesystem level (presumably due to the ioctl).
That of course is fine, but it makes me wonder a bit whether we should
have a generic helper for each fs to call that does the requisite
writeback and dio wait (similar to generic_remap_file_range_prep() for
example). Thoughts?
> ret = vmf->vma->vm_ops->page_mkwrite(vmf);
> /* Restore original flags so that caller is not surprised */
> vmf->flags = old_flags;
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 41eb48d9b527..697a101bda59 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1481,8 +1481,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
> case MAP_SHARED_VALIDATE:
> if (flags & ~flags_mask)
> return -EOPNOTSUPP;
> - if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
> - return -EACCES;
> + if (prot & PROT_WRITE) {
> + if (!(file->f_mode & FMODE_WRITE))
> + return -EACCES;
> + if (IS_IMMUTABLE(file_inode(file)))
> + return -EPERM;
> + }
We haven't done anything to clean up writeable mappings on marking the
inode immutable, right? It seems a little strange that we can have some
writeable mappings hang around while we can't create new ones, but
perhaps it doesn't matter if the write fault behavior is the same.
Brian
>
> /*
> * Make sure we don't allow writing to an append-only
>
On Wed, Apr 17, 2019 at 12:04:39PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <[email protected]>
>
> We passed an inode into xfs_ioctl_setattr_get_trans with join_flags
> indicating which locks are held on that inode. If we can't allocate a
> transaction then we need to unlock the inode before we bail out.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
Reviewed-by: Brian Foster <[email protected]>
> fs/xfs/xfs_ioctl.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index ae615a79b266..21d6f433c375 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1153,7 +1153,7 @@ xfs_ioctl_setattr_get_trans(
>
> error = xfs_trans_alloc(mp, &M_RES(mp)->tr_ichange, 0, 0, 0, &tp);
> if (error)
> - return ERR_PTR(error);
> + goto out_unlock;
>
> xfs_ilock(ip, XFS_ILOCK_EXCL);
> xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL | join_flags);
>
On Wed, Apr 17, 2019 at 12:04:47PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <[email protected]>
>
> The chattr manpage has this to say about immutable files:
>
> "A file with the 'i' attribute cannot be modified: it cannot be deleted
> or renamed, no link can be created to this file, most of the file's
> metadata can not be modified, and the file can not be opened in write
> mode."
>
> This means that we need to flush the page cache when setting the
> immutable flag so that all mappings will become read-only again and
> therefore programs cannot continue to write to writable mappings.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> fs/xfs/xfs_ioctl.c | 52 +++++++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 45 insertions(+), 7 deletions(-)
>
>
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 21d6f433c375..de35cf4469f6 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1009,6 +1009,31 @@ xfs_diflags_to_linux(
> #endif
> }
>
> +/*
> + * Lock the inode against file io and page faults, then flush all dirty pages
> + * and wait for writeback and direct IO operations to finish. Returns with
> + * the relevant inode lock flags set in @join_flags. Caller is responsible for
> + * unlocking even on error return.
> + */
> +static int
> +xfs_ioctl_setattr_flush(
> + struct xfs_inode *ip,
> + int *join_flags)
> +{
> + struct inode *inode = VFS_I(ip);
> +
> + /* Already locked the inode from IO? Assume we're done. */
> + if (((*join_flags) & (XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL)) ==
> + (XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL))
> + return 0;
> +
Ok, so I take it this is because the xfs_setattr_path() can call down
into here via dax_invalidate() and then subsequently via the immutable
check. Instead of burying this down here, could we just check join_flags
!= 0 prior to the second setattr_flush() call? Otherwise this looks Ok
to me.
Brian
> + /* Lock and flush all mappings and IO in preparation for flag change */
> + *join_flags = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
> + xfs_ilock(ip, *join_flags);
> + inode_dio_wait(inode);
> + return filemap_write_and_wait(inode->i_mapping);
> +}
> +
> static int
> xfs_ioctl_setattr_xflags(
> struct xfs_trans *tp,
> @@ -1103,25 +1128,22 @@ xfs_ioctl_setattr_dax_invalidate(
> if (!(fa->fsx_xflags & FS_XFLAG_DAX) && !IS_DAX(inode))
> return 0;
>
> - if (S_ISDIR(inode->i_mode))
> + if (!S_ISREG(inode->i_mode))
> return 0;
>
> /* lock, flush and invalidate mapping in preparation for flag change */
> - xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
> - error = filemap_write_and_wait(inode->i_mapping);
> + error = xfs_ioctl_setattr_flush(ip, join_flags);
> if (error)
> goto out_unlock;
> error = invalidate_inode_pages2(inode->i_mapping);
> if (error)
> goto out_unlock;
> -
> - *join_flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL;
> return 0;
>
> out_unlock:
> - xfs_iunlock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
> + xfs_iunlock(ip, *join_flags);
> + *join_flags = 0;
> return error;
> -
> }
>
> /*
> @@ -1367,6 +1389,22 @@ xfs_ioctl_setattr(
> if (code)
> goto error_free_dquots;
>
> + /*
> + * Wait for all pending directio and then flush all the dirty pages
> + * for this file. The flush marks all the pages readonly, so any
> + * subsequent attempt to write to the file (particularly mmap pages)
> + * will come through the filesystem and fail.
> + */
> + if (S_ISREG(VFS_I(ip)->i_mode) && !IS_IMMUTABLE(VFS_I(ip)) &&
> + (fa->fsx_xflags & FS_XFLAG_IMMUTABLE)) {
> + code = xfs_ioctl_setattr_flush(ip, &join_flags);
> + if (code) {
> + xfs_iunlock(ip, join_flags);
> + join_flags = 0;
> + goto error_free_dquots;
> + }
> + }
> +
> tp = xfs_ioctl_setattr_get_trans(ip, join_flags);
> if (IS_ERR(tp)) {
> code = PTR_ERR(tp);
>
On Wed, Apr 17, 2019 at 12:04:26PM -0700, Darrick J. Wong wrote:
> Hi all,
>
> The chattr(1) manpage has this to say about the immutable bit that
> system administrators can set on files:
>
> "A file with the 'i' attribute cannot be modified: it cannot be deleted
> or renamed, no link can be created to this file, most of the file's
> metadata can not be modified, and the file can not be opened in write
> mode."
>
> Given the clause about how the file 'cannot be modified', it is
> surprising that programs holding writable file descriptors can continue
> to write to and truncate files after the immutable flag has been set,
> but they cannot call other things such as utimes, fallocate, unlink,
> link, setxattr, or reflink.
>
> Since the immutable flag is only settable by administrators, resolve
> this inconsistent behavior in favor of the documented behavior -- once
> the flag is set, the file cannot be modified, period.
The manual page leaves the case undefined, though the word 'modified'
can be interpreted in the same sense as 'mtime' ie. modifying the file
data. The enumerated file operations that don't work on an immutable
file suggest that it's more like the 'ctime', ie. (state) changes are
forbidden.
Tthe patchset makes some sense, but it changes the semantics a bit. From
'not changed but still modified' to 'neither changed nor modified'. It
starts to sound like a word game, but I think both are often used
interchangeably in the language. See the changelog of 1/8 where you used
them in the other meaning regarding ctime and mtime.
I personally doubt there's a real use of the undefined case, though
something artificial like 'a process opens a fd, sets up file in a very
specific way, sets immutable and hands the fd to an unprivileged
process' can be made up. The overhead of the new checks seems to be
small so performance is not the concern here.
Overall, I don't see a strong reason for either semantics. As long as
it's documented possibly with some of the corner cases described in more
detail, fine.
On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <[email protected]>
>
> The chattr manpage has this to say about immutable files:
>
> "A file with the 'i' attribute cannot be modified: it cannot be deleted
> or renamed, no link can be created to this file, most of the file's
> metadata can not be modified, and the file can not be opened in write
> mode."
>
> Once the flag is set, it is enforced for quite a few file operations,
> such as fallocate, fpunch, fzero, rm, touch, open, etc. However, we
> don't check for immutability when doing a write(), a PROT_WRITE mmap(),
> a truncate(), or a write to a previously established mmap.
>
> If a program has an open write fd to a file that the administrator
> subsequently marks immutable, the program still can change the file
> contents. Weird!
>
> The ability to write to an immutable file does not follow the manpage
> promise that immutable files cannot be modified. Worse yet it's
> inconsistent with the behavior of other syscalls which don't allow
> modifications of immutable files.
>
> Therefore, add the necessary checks to make the write, mmap, and
> truncate behavior consistent with what the manpage says and consistent
> with other syscalls on filesystems which support IMMUTABLE.
>
> Signed-off-by: Darrick J. Wong <[email protected]>
Thanks, looks good. I'm going to take this patch through the ext4
tree since it doesn't have any dependencies on the rest of the patch
series.
- Ted
On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index ab650c21bccd..dfd5eba278d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
>
> vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
>
> + if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
> + return VM_FAULT_SIGBUS;
> +
> ret = vmf->vma->vm_ops->page_mkwrite(vmf);
> /* Restore original flags so that caller is not surprised */
> vmf->flags = old_flags;
Shouldn't this check be moved before the modification of vmf->flags?
It looks like do_page_mkwrite() isn't supposed to be returning with
vmf->flags modified, lest "the caller gets surprised".
- Ted
On Sun, Jun 09, 2019 at 09:51:45PM -0400, Theodore Ts'o wrote:
> On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index ab650c21bccd..dfd5eba278d6 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2149,6 +2149,9 @@ static vm_fault_t do_page_mkwrite(struct vm_fault *vmf)
> >
> > vmf->flags = FAULT_FLAG_WRITE|FAULT_FLAG_MKWRITE;
> >
> > + if (vmf->vma->vm_file && IS_IMMUTABLE(file_inode(vmf->vma->vm_file)))
> > + return VM_FAULT_SIGBUS;
> > +
> > ret = vmf->vma->vm_ops->page_mkwrite(vmf);
> > /* Restore original flags so that caller is not surprised */
> > vmf->flags = old_flags;
>
> Shouldn't this check be moved before the modification of vmf->flags?
> It looks like do_page_mkwrite() isn't supposed to be returning with
> vmf->flags modified, lest "the caller gets surprised".
Yeah, I think that was a merge error during a rebase... :(
Er ... if you're still planning to take this patch through your tree,
can you move it to above the "vmf->flags = FAULT_FLAG_WRITE..." ?
--D
> - Ted
On Sun, Jun 09, 2019 at 09:41:44PM -0700, Darrick J. Wong wrote:
> On Sun, Jun 09, 2019 at 09:51:45PM -0400, Theodore Ts'o wrote:
> > On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
>
> > Shouldn't this check be moved before the modification of vmf->flags?
> > It looks like do_page_mkwrite() isn't supposed to be returning with
> > vmf->flags modified, lest "the caller gets surprised".
>
> Yeah, I think that was a merge error during a rebase... :(
>
> Er ... if you're still planning to take this patch through your tree,
> can you move it to above the "vmf->flags = FAULT_FLAG_WRITE..." ?
I was planning on only taking 8/8 through the ext4 tree. I also added
a patch which filtered writes, truncates, and page_mkwrites (but not
mmap) for immutable files at the ext4 level.
I *could* take this patch through the mm/fs tree, but I wasn't sure
what your plans were for the rest of the patch series, and it seemed
like it hadn't gotten much review/attention from other fs or mm folks
(well, I guess Brian Foster weighed in).
What do you think?
- Ted
On Mon, Jun 10, 2019 at 09:14:17AM -0400, Theodore Ts'o wrote:
> On Sun, Jun 09, 2019 at 09:41:44PM -0700, Darrick J. Wong wrote:
> > On Sun, Jun 09, 2019 at 09:51:45PM -0400, Theodore Ts'o wrote:
> > > On Wed, Apr 17, 2019 at 12:04:33PM -0700, Darrick J. Wong wrote:
> >
> > > Shouldn't this check be moved before the modification of vmf->flags?
> > > It looks like do_page_mkwrite() isn't supposed to be returning with
> > > vmf->flags modified, lest "the caller gets surprised".
> >
> > Yeah, I think that was a merge error during a rebase... :(
> >
> > Er ... if you're still planning to take this patch through your tree,
> > can you move it to above the "vmf->flags = FAULT_FLAG_WRITE..." ?
>
> I was planning on only taking 8/8 through the ext4 tree. I also added
> a patch which filtered writes, truncates, and page_mkwrites (but not
> mmap) for immutable files at the ext4 level.
*Oh*. I saw your reply attached to the 1/8 patch and thought that was
the one you were taking. I was sort of surprised, tbh. :)
> I *could* take this patch through the mm/fs tree, but I wasn't sure
> what your plans were for the rest of the patch series, and it seemed
> like it hadn't gotten much review/attention from other fs or mm folks
> (well, I guess Brian Foster weighed in).
> What do you think?
Not sure. The comments attached to the LWN story were sort of nasty,
and now that a couple of people said "Oh, well, Debian documented the
inconsistent behavior so just let it be" I haven't felt like
resurrecting the series for 5.3.
I do want to clean up the parameter validation for the VFS SETFLAGS and
FSSETXATTR ioctls though... eh, maybe I'll just send out the series as
it stands now. I'm still maintaining it, so all that work might as well
go somewhere.
--D
>
> - Ted
>
>
>
On Mon, Jun 10, 2019 at 09:09:34AM -0700, Darrick J. Wong wrote:
> > I was planning on only taking 8/8 through the ext4 tree. I also added
> > a patch which filtered writes, truncates, and page_mkwrites (but not
> > mmap) for immutable files at the ext4 level.
>
> *Oh*. I saw your reply attached to the 1/8 patch and thought that was
> the one you were taking. I was sort of surprised, tbh. :)
Sorry, my bad. I mis-replied to the wrong e-mail message :-)
> > I *could* take this patch through the mm/fs tree, but I wasn't sure
> > what your plans were for the rest of the patch series, and it seemed
> > like it hadn't gotten much review/attention from other fs or mm folks
> > (well, I guess Brian Foster weighed in).
>
> > What do you think?
>
> Not sure. The comments attached to the LWN story were sort of nasty,
> and now that a couple of people said "Oh, well, Debian documented the
> inconsistent behavior so just let it be" I haven't felt like
> resurrecting the series for 5.3.
Ah, I had missed the LWN article. <Looks>
Yeah, it's the same set of issues that we had discussed when this
first came up. We can go round and round on this one; It's true that
root can now cause random programs which have a file mmap'ed for
writing to seg fault, but root has a million ways of killing and
otherwise harming running application programs, and it's unlikely
files get marked for immutable all that often. We just have to pick
one way of doing things, and let it be same across all the file
systems.
My understanding was that XFS had chosen to make the inode immutable
as soon as the flag is set (as opposed to forbidding new fd's to be
opened which were writeable), and I was OK moving ext4 to that common
interpretation of the immmutable bit, even though it would be a change
to ext4.
And then when I saw that Amir had included a patch that would cause
test failures unless that patch series was applied, it seemed that we
had all thought that the change was a done deal. Perhaps we should
have had a more explicit discussion when the test was sent for review,
but I had assumed it was exclusively a copy_file_range set of tests,
so I didn't realize it was going to cause ext4 failures.
- Ted
On Mon, Jun 11, 2019 at 04:41:54PM -0400, Theodore Ts'o wrote:
> On Mon, Jun 10, 2019 at 09:09:34AM -0700, Darrick J. Wong wrote:
> > > I was planning on only taking 8/8 through the ext4 tree. I also added
> > > a patch which filtered writes, truncates, and page_mkwrites (but not
> > > mmap) for immutable files at the ext4 level.
> >
> > *Oh*. I saw your reply attached to the 1/8 patch and thought that was
> > the one you were taking. I was sort of surprised, tbh. :)
>
> Sorry, my bad. I mis-replied to the wrong e-mail message :-)
>
> > > I *could* take this patch through the mm/fs tree, but I wasn't sure
> > > what your plans were for the rest of the patch series, and it seemed
> > > like it hadn't gotten much review/attention from other fs or mm folks
> > > (well, I guess Brian Foster weighed in).
> >
> > > What do you think?
> >
> > Not sure. The comments attached to the LWN story were sort of nasty,
> > and now that a couple of people said "Oh, well, Debian documented the
> > inconsistent behavior so just let it be" I haven't felt like
> > resurrecting the series for 5.3.
>
> Ah, I had missed the LWN article. <Looks>
>
> Yeah, it's the same set of issues that we had discussed when this
> first came up. We can go round and round on this one; It's true that
> root can now cause random programs which have a file mmap'ed for
> writing to seg fault, but root has a million ways of killing and
> otherwise harming running application programs, and it's unlikely
> files get marked for immutable all that often. We just have to pick
> one way of doing things, and let it be same across all the file
> systems.
>
> My understanding was that XFS had chosen to make the inode immutable
> as soon as the flag is set (as opposed to forbidding new fd's to be
> opened which were writeable), and I was OK moving ext4 to that common
> interpretation of the immmutable bit, even though it would be a change
> to ext4.
<nod> It started as "just do this to xfs" and has now become a vfs level
change...
> And then when I saw that Amir had included a patch that would cause
> test failures unless that patch series was applied, it seemed that we
> had all thought that the change was a done deal. Perhaps we should
> have had a more explicit discussion when the test was sent for review,
> but I had assumed it was exclusively a copy_file_range set of tests,
> so I didn't realize it was going to cause ext4 failures.
And here we see the inconsistent behavior causing developer confusion. :)
I think Amir's c_f_r tests just check the existing behavior (of just
c_f_r) that you can't (most of the time) copy into a file that you
opened for write but that the administrator has since marked immutable.
/That/ behavior in turn came from the original implementation that would
try reflink which would fail on the immutable destination check and then
fail the whole call ... I think?
--D
> - Ted
On Mon, Jun 10, 2019 at 04:41:54PM -0400, Theodore Ts'o wrote:
> On Mon, Jun 10, 2019 at 09:09:34AM -0700, Darrick J. Wong wrote:
> > > I was planning on only taking 8/8 through the ext4 tree. I also added
> > > a patch which filtered writes, truncates, and page_mkwrites (but not
> > > mmap) for immutable files at the ext4 level.
> >
> > *Oh*. I saw your reply attached to the 1/8 patch and thought that was
> > the one you were taking. I was sort of surprised, tbh. :)
>
> Sorry, my bad. I mis-replied to the wrong e-mail message :-)
Also ... after flailing around with the v2 series I decided that it
would be much less work to refactor all the current implementations to
call a common parameter-checking function, which will hopefully make the
behavior of SETFLAGS and FSSETXATTR more consistent across filesystems.
That makes the immutable series much less code and fewer patches, but
also means that the 8/8 patch isn't needed anymore.
I'm about to send both out.
--D
> > > I *could* take this patch through the mm/fs tree, but I wasn't sure
> > > what your plans were for the rest of the patch series, and it seemed
> > > like it hadn't gotten much review/attention from other fs or mm folks
> > > (well, I guess Brian Foster weighed in).
> >
> > > What do you think?
> >
> > Not sure. The comments attached to the LWN story were sort of nasty,
> > and now that a couple of people said "Oh, well, Debian documented the
> > inconsistent behavior so just let it be" I haven't felt like
> > resurrecting the series for 5.3.
>
> Ah, I had missed the LWN article. <Looks>
>
> Yeah, it's the same set of issues that we had discussed when this
> first came up. We can go round and round on this one; It's true that
> root can now cause random programs which have a file mmap'ed for
> writing to seg fault, but root has a million ways of killing and
> otherwise harming running application programs, and it's unlikely
> files get marked for immutable all that often. We just have to pick
> one way of doing things, and let it be same across all the file
> systems.
>
> My understanding was that XFS had chosen to make the inode immutable
> as soon as the flag is set (as opposed to forbidding new fd's to be
> opened which were writeable), and I was OK moving ext4 to that common
> interpretation of the immmutable bit, even though it would be a change
> to ext4.
>
> And then when I saw that Amir had included a patch that would cause
> test failures unless that patch series was applied, it seemed that we
> had all thought that the change was a done deal. Perhaps we should
> have had a more explicit discussion when the test was sent for review,
> but I had assumed it was exclusively a copy_file_range set of tests,
> so I didn't realize it was going to cause ext4 failures.
>
> - Ted