2022-04-13 03:52:54

by Darrick J. Wong

Subject: Re: [RFC PATCH] ext4: add unmount filesystem message

On Wed, Apr 13, 2022 at 10:23:31AM +0800, Zhang Yi wrote:
> On 2022/4/13 9:35, Theodore Ts'o wrote:
> > On Tue, Apr 12, 2022 at 12:01:37PM -0400, Gabriel Krisman Bertazi wrote:
> >> Zhang Yi <[email protected]> writes:
> >>
> >>> Now that we have kernel message at mount time, system administrator
> >
> > "Now that we have...." is a bit misleading, since (at least to an
> > English speaker) that this is something that was recently added, and
> > that's not the case.
> >
> >>> could acquire the mount time, device and options easily. But we don't
> >>> have corresponding unmounting message at umount time, so we cannot know
> >>> if someone umount a filesystem easily. Some of the modern filesystems
> >>> (e.g. xfs) have the umounting kernel message, so add one for ext4
> >>> filesystem for convenience.
> >>>
> >>> EXT4-fs (sdb): mounted filesystem with ordered data mode. Quota mode: none.
> >>> EXT4-fs (sdb): unmounting filesystem.
> >>
> >> I don't think sysadmins should be relying on the kernel log for this,
> >> since the information can easily be overwritten by new messages there.
> >> Is there a reason why you can't just monitor /proc/self/mountinfo?
> >
> > You're right that it can be dangerous for sysadmins to be relying on
> > the kernel log for mount and umount notifications --- but it depends
> > on what they think it means, and the potential pitfalls are there for
> > both the mount and unmount messages. The problem of course, is that
> > bind mounts, and mount name spaces, so if the question is whether a
> > file system is available at a particular mount point, then using the
> > kernel log is definitely not going to be reliable.
> >
> > But if the goal is to determine whether a particular device is safe to
> > run fsck or otherwise access directly, or for the purposes of
> > debugging the kernel and looking at the logs to understand when the
> > device is being accessed by the kernel and when the file system is
> > done with the device, I can see how it might be useful.
> >
>
> Yes, I understand that the kernel log is not reliable, and
> /proc/self/mountinfo neither. Our goal is simple, As Ted said, just add a
> method to help sysadmins to know whether a particular ext4 device is really
> doing unmount procedure, it could be helpful for us to debug kernel and
> locate kernel bug.

But if the mount/unmount messages are ratelimited, how will you know for
sure if the ratelimiting mechanism elides the message?
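
For comparison, the /proc/self/mountinfo check suggested earlier in the
thread does not depend on the log at all. Below is a minimal sketch,
assuming the field layout documented in proc(5) (mount point in the fifth
field, filesystem type and mount source after the "-" separator). The
function name mounts_of and the sample text are hypothetical, and the result
only reflects the mount namespace of the process reading the file.

```python
def mounts_of(mountinfo_text, source):
    """Return (mount_point, fstype) pairs whose mount source equals `source`.

    `mountinfo_text` is the content of a mountinfo file; the layout assumed
    here follows proc(5): fixed fields, optional fields, a lone "-", then
    filesystem type, mount source and super options.
    """
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # not a well-formed mountinfo line
        sep = fields.index("-")
        if sep < 5 or len(fields) < sep + 3:
            continue  # too short to hold the fields we need
        fstype, mount_source = fields[sep + 1], fields[sep + 2]
        if mount_source == source:
            found.append((fields[4], fstype))
    return found


# Illustrative sample only; real input would come from /proc/self/mountinfo.
SAMPLE = (
    "36 35 8:16 / /mnt rw,relatime shared:1 - ext4 /dev/sdb rw\n"
    "37 35 0:5 / /proc rw - proc proc rw\n"
)
```

On a live system this could be called as
mounts_of(open("/proc/self/mountinfo").read(), "/dev/sdb"). An empty result
says only that the device is not mounted in this namespace, not that the
kernel is done with it.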

--D

> Thanks,
> Yi.
>
>
>


2022-04-13 10:59:52

by Zhang Yi

Subject: Re: [RFC PATCH] ext4: add unmount filesystem message

On 2022/4/13 11:51, Darrick J. Wong wrote:
> On Wed, Apr 13, 2022 at 10:23:31AM +0800, Zhang Yi wrote:
>> On 2022/4/13 9:35, Theodore Ts'o wrote:
>>> On Tue, Apr 12, 2022 at 12:01:37PM -0400, Gabriel Krisman Bertazi wrote:
>>>> Zhang Yi <[email protected]> writes:
>>>>
>>>>> Now that we have kernel message at mount time, system administrator
>>>
>>> "Now that we have...." is a bit misleading, since (at least to an
>>> English speaker) that this is something that was recently added, and
>>> that's not the case.
>>>
>>>>> could acquire the mount time, device and options easily. But we don't
>>>>> have corresponding unmounting message at umount time, so we cannot know
>>>>> if someone umount a filesystem easily. Some of the modern filesystems
>>>>> (e.g. xfs) have the umounting kernel message, so add one for ext4
>>>>> filesystem for convenience.
>>>>>
>>>>> EXT4-fs (sdb): mounted filesystem with ordered data mode. Quota mode: none.
>>>>> EXT4-fs (sdb): unmounting filesystem.
>>>>
>>>> I don't think sysadmins should be relying on the kernel log for this,
>>>> since the information can easily be overwritten by new messages there.
>>>> Is there a reason why you can't just monitor /proc/self/mountinfo?
>>>
>>> You're right that it can be dangerous for sysadmins to be relying on
>>> the kernel log for mount and umount notifications --- but it depends
>>> on what they think it means, and the potential pitfalls are there for
>>> both the mount and unmount messages. The problem of course, is that
>>> bind mounts, and mount name spaces, so if the question is whether a
>>> file system is available at a particular mount point, then using the
>>> kernel log is definitely not going to be reliable.
>>>
>>> But if the goal is to determine whether a particular device is safe to
>>> run fsck or otherwise access directly, or for the purposes of
>>> debugging the kernel and looking at the logs to understand when the
>>> device is being accessed by the kernel and when the file system is
>>> done with the device, I can see how it might be useful.
>>>
>>
>> Yes, I understand that the kernel log is not reliable, and
>> /proc/self/mountinfo neither. Our goal is simple, As Ted said, just add a
>> method to help sysadmins to know whether a particular ext4 device is really
>> doing unmount procedure, it could be helpful for us to debug kernel and
>> locate kernel bug.
>
> But if the mount/unmount messages are ratelimited, how will you know for
> sure if the ratelimiting mechanism elides the message?
>

It is expected that the messages may be ratelimited. This is just a
best-effort way to get more information: it's best if something gets written
down, and not surprising if it doesn't. If the messages are ratelimited, we
will get the "...suppressed" message and know what happened, and we will
combine it with other logs (e.g. the systemd log) to make things as clear as
possible.
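
A minimal sketch of that kind of best-effort log scan follows. It assumes
the mount and unmount lines look like the two examples in the patch
description, and that the ratelimit notice contains "EXT4-fs" and
"<N> callbacks suppressed"; that second format is an assumption to verify
against the running kernel. The function name scan_log and the sample lines
are hypothetical.

```python
import re

# Based on the example messages quoted in the patch description.
EVENT_RE = re.compile(
    r"EXT4-fs \((?P<dev>[^)]+)\): "
    r"(?P<what>mounted filesystem|unmounting filesystem)"
)
# Assumed wording of the ratelimit notice; verify on your kernel.
SUPPRESSED_RE = re.compile(r"(?P<count>\d+) callbacks suppressed")


def scan_log(lines, device):
    """Return (events, suppressed) for `device`.

    events     -- list of "mount"/"unmount" in log order
    suppressed -- total count from EXT4-fs ratelimit notices; a nonzero
                  value means the event list may be incomplete
    """
    events = []
    suppressed = 0
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            if m.group("dev") == device:
                kind = "mount" if m.group("what").startswith("mounted") else "unmount"
                events.append(kind)
            continue
        s = SUPPRESSED_RE.search(line)
        if s and "EXT4-fs" in line:
            suppressed += int(s.group("count"))
    return events, suppressed


# Illustrative input, not captured from a real system.
SAMPLE_LOG = [
    "EXT4-fs (sdb): mounted filesystem with ordered data mode. Quota mode: none.",
    "EXT4-fs: 3 callbacks suppressed",
    "EXT4-fs (sdb): unmounting filesystem.",
    "EXT4-fs (sdc): unmounting filesystem.",
]
```

A nonzero suppressed count does not say which messages were dropped, so it
only marks the window where other logs would need to be consulted.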

Thanks,
Yi.