Hello,
Eric reported a problem to me where we were clearing SB_I_VERSION on remount of
a btrfs file system. After digging through I discovered it's because we expect
the proper flags that we want to be passed in via the mount() syscall, and
because we didn't have "iversion" in our show_options entry the mount binary
(form util-linux) wasn't setting MS_I_VERSION for the remount, and thus the VFS
was clearing SB_I_VERSION from our s_flags.
No big deal, I'll fix show_mount. Except Eric then noticed that mount -o
noiversion didn't do anything, we still get iversion set. That's because btrfs
just defaults to having SB_I_VERSION set. Furthermore -o noiversion doesn't get
sent into mount, it's handled by the mount binary itself, and it does this by
not having MS_I_VERSION set in the mount flags.
This happens as well for -o relatime, it's the default and so if you do mount -o
norelatime it won't do anything, you still get relatime behavior. The only time
this changes is if you do mount -o remount,norelatime.
So my question is, what do we do here? I know Christoph has the strong opinion
that we just don't expose I_VERSION at all, which frankly I'm ok with. However
more what I'm asking is what do we do with these weird inverted flags that we
all just kind of ignore on mount? The current setup is just broken if we want
to allow overriding the defaults at mount time. Are we ok with it just being
broken? Are we ok with things like mount -o noiversion not working because the
file system has decided that I_VERSION (or relatime) is the default, and we're
going to ignore what the user asks for unless we're remounting? Thanks,
Josef
On 7/29/20 11:32 AM, Josef Bacik wrote:
> Hello,
>
> Eric reported a problem to me where we were clearing SB_I_VERSION on remount of a btrfs file system. After digging through I discovered it's because we expect the proper flags that we want to be passed in via the mount() syscall, and because we didn't have "iversion" in our show_options entry the mount binary (form util-linux) wasn't setting MS_I_VERSION for the remount, and thus the VFS was clearing SB_I_VERSION from our s_flags.
>
> No big deal, I'll fix show_mount. Except Eric then noticed that mount -o noiversion didn't do anything, we still get iversion set. That's because btrfs just defaults to having SB_I_VERSION set. Furthermore -o noiversion doesn't get sent into mount, it's handled by the mount binary itself, and it does this by not having MS_I_VERSION set in the mount flags.
This was beaten^Wdiscussed to death in an earlier thread,
[PATCH] fs: i_version mntopt gets visible through /proc/mounts
https://lore.kernel.org/linux-fsdevel/[email protected]/
tl;dr: hch doesn't think [no]iversion should be exposed as an option /at all/
so exposing it in /proc/mounts in show_mnt_opts for mount(8)'s benefit was
nacked.
> This happens as well for -o relatime, it's the default and so if you do mount -o norelatime it won't do anything, you still get relatime behavior.
I think that's a different issue.
> The only time this changes is if you do mount -o remount,norelatime.
Hm, not on xfs:
# mount -o loop,norelatime xfsfile mnt
# grep loop /proc/mounts
/dev/loop0 /tmp/mnt xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
# mount -o remount,norelatime mnt
# grep loop /proc/mounts
/dev/loop0 /tmp/mnt xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
Here, "norelatime" only makes the mount binary omit the MS_RELATIME flag.
The only way to override relatime behavior is mount -o strictatime, AFAICT.
IOWS "norelatime" and "strictatime" are the same (right?); perhaps
mount -o norelatime should set the MS_STRICTATIME flag.
> So my question is, what do we do here? I know Christoph has the strong opinion that we just don't expose I_VERSION at all, which frankly I'm ok with. However more what I'm asking is what do we do with these weird inverted flags that we all just kind of ignore on mount? The current setup is just broken if we want to allow overriding the defaults at mount time. Are we ok with it just being broken? Are we ok with things like mount -o noiversion not working because the file system has decided that I_VERSION (or relatime) is the default, and we're going to ignore what the user asks for unless we're remounting? Thanks,
Are there other oddities besides iversion and relatime?
-Eric
On 7/29/20 2:41 PM, Eric Sandeen wrote:
> On 7/29/20 11:32 AM, Josef Bacik wrote:
>> Hello,
>>
>> Eric reported a problem to me where we were clearing SB_I_VERSION on remount of a btrfs file system. After digging through I discovered it's because we expect the proper flags that we want to be passed in via the mount() syscall, and because we didn't have "iversion" in our show_options entry the mount binary (form util-linux) wasn't setting MS_I_VERSION for the remount, and thus the VFS was clearing SB_I_VERSION from our s_flags.
>>
>> No big deal, I'll fix show_mount. Except Eric then noticed that mount -o noiversion didn't do anything, we still get iversion set. That's because btrfs just defaults to having SB_I_VERSION set. Furthermore -o noiversion doesn't get sent into mount, it's handled by the mount binary itself, and it does this by not having MS_I_VERSION set in the mount flags.
>
> This was beaten^Wdiscussed to death in an earlier thread,
> [PATCH] fs: i_version mntopt gets visible through /proc/mounts
>
> https://lore.kernel.org/linux-fsdevel/[email protected]/
>
> tl;dr: hch doesn't think [no]iversion should be exposed as an option /at all/
> so exposing it in /proc/mounts in show_mnt_opts for mount(8)'s benefit was
> nacked.
>
>> This happens as well for -o relatime, it's the default and so if you do mount -o norelatime it won't do anything, you still get relatime behavior.
>
> I think that's a different issue.
>
>> The only time this changes is if you do mount -o remount,norelatime.
>
> Hm, not on xfs:
>
> # mount -o loop,norelatime xfsfile mnt
> # grep loop /proc/mounts
> /dev/loop0 /tmp/mnt xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
>
> # mount -o remount,norelatime mnt
> # grep loop /proc/mounts
> /dev/loop0 /tmp/mnt xfs rw,seclabel,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
>
Oops you're right, I'm blind. Same happens for btrfs, so using -o norelatime
simply does nothing because it's considered a kernel wide default.
>
> Are there other oddities besides iversion and relatime?
It doesn't look like it, I checked a few others of the MS_INVERT variety, these
appear to be the only ones. I really don't want to have this discussion again
in the future tho when we introduce MS_SOME_NEW_AWESOME. Thanks,
Josef
Josef Bacik <[email protected]> wrote:
> So my question is, what do we do here?
Hmmm... As the code stands, MS_RDONLY, MS_SYNCHRONOUS, MS_MANDLOCK,
MS_I_VERSION and MS_LAZYTIME should all be masked off before the new flags are
set if called from mount(2) rather than fsconfig(2).
do_remount() gives MS_RMT_MASK to fs_context_for_reconfigure() to load into
fc->sb_flags_mask, which should achieve the desired effect in
reconfigure_super() on this line:
WRITE_ONCE(sb->s_flags, ((sb->s_flags & ~fc->sb_flags_mask) |
(fc->sb_flags & fc->sb_flags_mask)));
David
David Howells <[email protected]> wrote:
> > So my question is, what do we do here?
>
> Hmmm... As the code stands, MS_RDONLY, MS_SYNCHRONOUS, MS_MANDLOCK,
> MS_I_VERSION and MS_LAZYTIME should all be masked off before the new flags are
> set if called from mount(2) rather than fsconfig(2).
>
> do_remount() gives MS_RMT_MASK to fs_context_for_reconfigure() to load into
> fc->sb_flags_mask, which should achieve the desired effect in
> reconfigure_super() on this line:
>
> WRITE_ONCE(sb->s_flags, ((sb->s_flags & ~fc->sb_flags_mask) |
> (fc->sb_flags & fc->sb_flags_mask)));
So applying the attached patch and then doing:
mount -t tmpfs none /mnt
mount -o remount,iversion /mnt
mount -o remount,noiversion /mnt
mount -o remount,norelatime /mnt
mount -o remount,relatime /mnt
prints:
sb=70010000 set=800000 mask=2800051
sb=70810000 set=0 mask=2800051
sb=70010000 set=0 mask=2800051
sb=70010000 set=0 mask=2800051
MS_RELATIME isn't included in MS_RMT_MASK, so remount shouldn't be able to
change it.
David
---
diff --git a/fs/super.c b/fs/super.c
index 904459b35119..540cb37c11e7 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -964,6 +964,7 @@ int reconfigure_super(struct fs_context *fc)
}
}
+ printk("sb=%lx set=%x mask=%x\n", sb->s_flags, fc->sb_flags, fc->sb_flags_mask);
WRITE_ONCE(sb->s_flags, ((sb->s_flags & ~fc->sb_flags_mask) |
(fc->sb_flags & fc->sb_flags_mask)));
/* Needs to be ordered wrt mnt_is_readonly() */