2023-09-13 04:20:59

by Dave Chinner

Subject: Re: [PATCH v2] xfs: introduce protection for drop nlink

On Mon, Sep 11, 2023 at 04:12:56PM +0800, [email protected] wrote:
> From: Cheng Lin <[email protected]>
>
> When abnormal drop_nlink are detected on the inode,
> shutdown filesystem, to avoid corruption propagation.
>
> Signed-off-by: Cheng Lin <[email protected]>
> ---
> fs/xfs/xfs_fsops.c | 3 +++
> fs/xfs/xfs_inode.c | 9 +++++++++
> fs/xfs/xfs_mount.h | 1 +
> 3 files changed, 13 insertions(+)
>
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 7cb75cb6b..6fc1cfe83 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -543,6 +543,9 @@ xfs_do_force_shutdown(
> } else if (flags & SHUTDOWN_CORRUPT_ONDISK) {
> tag = XFS_PTAG_SHUTDOWN_CORRUPT;
> why = "Corruption of on-disk metadata";
> + } else if (flags & SHUTDOWN_CORRRUPT_ABN) {
> + tag = XFS_PTAG_SHUTDOWN_CORRUPT;
> + why = "Corruption of Abnormal conditions";

We don't need a new shutdown tag. We can consider this in-memory
corruption because we detected it in memory before it went to disk
(SHUTDOWN_CORRUPT_INCORE) or even on-disk corruption because the
reference count on disk is likely wrong at this point......

> } else if (flags & SHUTDOWN_DEVICE_REMOVED) {
> tag = XFS_PTAG_SHUTDOWN_IOERROR;
> why = "Block device removal";
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 9e62cc500..2d41f2461 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -919,6 +919,15 @@ xfs_droplink(
> xfs_trans_t *tp,
> xfs_inode_t *ip)
> {
> +
> + if (VFS_I(ip)->i_nlink == 0) {
> + xfs_alert(ip->i_mount,
> + "%s: Deleting inode %llu with no links.",
> + __func__, ip->i_ino);
> + xfs_force_shutdown(ip->i_mount, SHUTDOWN_CORRRUPT_ABN);
> + return -EFSCORRUPTED;
> + }
> +
> xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
>
> drop_nlink(VFS_I(ip));

I'd kind of prefer that drop_nlink() be made to return an error on
underrun - if it's important enough to drop a warning in the log and
potentially panic the kernel, it's important enough to tell the
filesystem an underrun has occurred. But that opens a whole new can
of worms, so I think this will be fine.
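
As a rough illustration of the idea above, here is a userspace sketch of a link-count drop that reports underrun to its caller rather than only warning. It is not a proposed VFS change: toy_inode and toy_drop_nlink() are hypothetical names, and -EINVAL is only a placeholder error code.

```c
#include <errno.h>

/* Hypothetical stand-in for the link count kept in struct inode. */
struct toy_inode {
	unsigned int i_nlink;
};

/*
 * Hypothetical variant of drop_nlink(): on underrun, leave the count
 * untouched and tell the caller, instead of only logging a warning.
 */
static int toy_drop_nlink(struct toy_inode *inode)
{
	if (inode->i_nlink == 0)
		return -EINVAL;	/* placeholder for an underrun error */

	inode->i_nlink--;
	return 0;
}
```

With a helper shaped like this, the filesystem could turn the returned error into -EFSCORRUPTED itself, without a separate i_nlink check before the call.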

Note that we don't actually need a call to shut the filesystem down.
Simply returning -EFSCORRUPTED will result in the filesystem being
shut down if the transaction is dirty when it gets cancelled due to
the droplink error.
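
A minimal userspace model of that behaviour, assuming only what is described above: the cancel path shuts the filesystem down when the transaction is dirty. Every toy_* name and TOY_EFSCORRUPTED are hypothetical stand-ins, not the real XFS structures or functions.

```c
#include <stdbool.h>

#define TOY_EFSCORRUPTED 117	/* illustrative stand-in for EFSCORRUPTED */

struct toy_mount { bool shutdown; };
struct toy_trans { struct toy_mount *mp; bool dirty; };
struct toy_inode { unsigned int nlink; };

/* The check from the patch, without the explicit shutdown call. */
static int toy_droplink(struct toy_trans *tp, struct toy_inode *ip)
{
	if (ip->nlink == 0)
		return -TOY_EFSCORRUPTED;

	ip->nlink--;
	tp->dirty = true;	/* the inode change dirties the transaction */
	return 0;
}

/* Cancelling a dirty transaction is what shuts the filesystem down. */
static void toy_trans_cancel(struct toy_trans *tp)
{
	if (tp->dirty)
		tp->mp->shutdown = true;
}

/* Caller pattern: on error, cancel the transaction and pass the error up. */
static int toy_remove(struct toy_mount *mp, struct toy_inode *ip,
		      bool already_dirty)
{
	struct toy_trans tp = { .mp = mp, .dirty = already_dirty };
	int error = toy_droplink(&tp, ip);

	if (error) {
		toy_trans_cancel(&tp);
		return error;
	}
	return 0;	/* commit path elided */
}
```

In this model a clean transaction is cancelled without a shutdown, because nothing has been modified yet. A dirty one shuts the filesystem down even though toy_droplink() never requests it.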

Cheers,

Dave.
--
Dave Chinner
[email protected]


2023-09-13 10:19:37

by Cheng Lin

Subject: Re: [PATCH v2] xfs: introduce protection for drop nlink

> On Mon, Sep 11, 2023 at 04:12:56PM +0800, [email protected] wrote:
> > From: Cheng Lin <[email protected]>
> >
> > When abnormal drop_nlink are detected on the inode,
> > shutdown filesystem, to avoid corruption propagation.
> >
> > Signed-off-by: Cheng Lin <[email protected]>
> > ---
> > fs/xfs/xfs_fsops.c | 3 +++
> > fs/xfs/xfs_inode.c | 9 +++++++++
> > fs/xfs/xfs_mount.h | 1 +
> > 3 files changed, 13 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> > index 7cb75cb6b..6fc1cfe83 100644
> > --- a/fs/xfs/xfs_fsops.c
> > +++ b/fs/xfs/xfs_fsops.c
> > @@ -543,6 +543,9 @@ xfs_do_force_shutdown(
> > } else if (flags & SHUTDOWN_CORRUPT_ONDISK) {
> > tag = XFS_PTAG_SHUTDOWN_CORRUPT;
> > why = "Corruption of on-disk metadata";
> > + } else if (flags & SHUTDOWN_CORRRUPT_ABN) {
> > + tag = XFS_PTAG_SHUTDOWN_CORRUPT;
> > + why = "Corruption of Abnormal conditions";
>
> We don't need a new shutdown tag. We can consider this in-memory
> corruption because we detected it in memory before it went to disk
> (SHUTDOWN_CORRUPT_INCORE) or even on-disk corruption because the
> reference count on disk is likely wrong at this point......
>
> > } else if (flags & SHUTDOWN_DEVICE_REMOVED) {
> > tag = XFS_PTAG_SHUTDOWN_IOERROR;
> > why = "Block device removal";
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index 9e62cc500..2d41f2461 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -919,6 +919,15 @@ xfs_droplink(
> > xfs_trans_t *tp,
> > xfs_inode_t *ip)
> > {
> > +
> > + if (VFS_I(ip)->i_nlink == 0) {
> > + xfs_alert(ip->i_mount,
> > + "%s: Deleting inode %llu with no links.",
> > + __func__, ip->i_ino);
> > + xfs_force_shutdown(ip->i_mount, SHUTDOWN_CORRRUPT_ABN);
> > + return -EFSCORRUPTED;
> > + }
> > +
> > xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
> >
> > drop_nlink(VFS_I(ip));
>
> I'd kind of prefer that drop_nlink() be made to return an error on
> underrun - if it's important enough to drop a warning in the log and
> potentially panic the kernel, it's important enough to tell the
> filesystem an underrun has occurred. But that opens a whole new can
> of worms, so I think this will be fine.

In the VFS, drop_nlink(), clear_nlink(), set_nlink() and inc_nlink() all
return void. Would it be appropriate to make them return an error
instead of using WARN_ON?

> Note that we don't actually need a call to shut the filesystem down.
> Simply returning -EFSCORRUPTED will result in the filesystem being
> shut down if the transaction is dirty when it gets cancelled due to
> the droplink error.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]