2023-05-08 01:46:20

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

It's been about 5 months since the last v3 RFC for fs freeze. Now
that some of us will have time at LSFMM to discuss stuff I figured
it would be good to try to get the last pieces, if any discussed,
and put out a new patch set based on the latests feedback. This
time I boot tested, and stress tested the patches. From what I can
tell I confirm nothing regresses, we just end up now with a new world
where if your platform does support s3 / s4 we'll kick into gear the
automatic kernel freeze.

To help with testing, as this is a rather tiny bit obscure area with
virtualization, I've gone ahead and extended kdevops with support for
always enabling s3 / s4, so it should be easy to test guest bring up
there.

I've picked out using stress-ng now to have fun stress testing suspend,
the insane will try:

./stress-ng --touch 8192 --touch-method creat

Resume works but eventually suspend will trickle tons of OOMs and so
we gotta find a sweet spot to automate this somehow in fstests somehow.
I am not sure how we're gonna test this moving forward on fstests but
perhaps worth talking about at LSFMM for ideas.

Anyway, your filesystem will not participate in the auto kernel
freeze / thaw unless your filesystem gets the kthread freezer junk
removed and sets a flag. I'll post 3 patches for the 3 main filesystems
after this. I've carried and advanced the SmPL patch for a few years
now, and magically it all still works.

[0] https://lkml.kernel.org/r/[email protected]

Luis Chamberlain (6):
fs: unify locking semantics for fs freeze / thaw
fs: add frozen sb state helpers
fs: distinguish between user initiated freeze and kernel initiated
freeze
fs: move !SB_BORN check early on freeze and add for thaw
fs: add iterate_supers_excl() and iterate_supers_reverse_excl()
fs: add automatic kernel fs freeze / thaw and remove kthread freezing

block/bdev.c | 9 +-
fs/ext4/ext4_jbd2.c | 2 +-
fs/f2fs/gc.c | 9 +-
fs/gfs2/glops.c | 2 +-
fs/gfs2/super.c | 11 +-
fs/gfs2/sys.c | 12 ++-
fs/gfs2/util.c | 7 +-
fs/ioctl.c | 14 ++-
fs/quota/quota.c | 4 +-
fs/super.c | 239 +++++++++++++++++++++++++++++++++--------
fs/xfs/xfs_trans.c | 3 +-
include/linux/fs.h | 54 +++++++++-
kernel/power/process.c | 15 ++-
13 files changed, 313 insertions(+), 68 deletions(-)

--
2.39.2


2023-05-08 01:46:40

by Luis Chamberlain

[permalink] [raw]
Subject: [PATCH 1/6] fs: unify locking semantics for fs freeze / thaw

Right now freeze_super() and thaw_super() are called with
different locking contexts. To expand on this is messy, so
just unify the requirement to require grabbing an active
reference and keep the superblock locked.

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
---
block/bdev.c | 5 +++-
fs/f2fs/gc.c | 5 ++++
fs/gfs2/super.c | 9 +++++--
fs/gfs2/sys.c | 6 +++++
fs/gfs2/util.c | 5 ++++
fs/ioctl.c | 12 ++++++++--
fs/super.c | 62 ++++++++++++++++++-------------------------------
7 files changed, 60 insertions(+), 44 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 21c63bfef323..dc54a2a1c46e 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -251,7 +251,7 @@ int freeze_bdev(struct block_device *bdev)
error = sb->s_op->freeze_super(sb);
else
error = freeze_super(sb);
- deactivate_super(sb);
+ deactivate_locked_super(sb);

if (error) {
bdev->bd_fsfreeze_count--;
@@ -289,6 +289,8 @@ int thaw_bdev(struct block_device *bdev)
sb = bdev->bd_fsfreeze_sb;
if (!sb)
goto out;
+ if (!get_active_super(bdev))
+ goto out;

if (sb->s_op->thaw_super)
error = sb->s_op->thaw_super(sb);
@@ -298,6 +300,7 @@ int thaw_bdev(struct block_device *bdev)
bdev->bd_fsfreeze_count++;
else
bdev->bd_fsfreeze_sb = NULL;
+ deactivate_locked_super(sb);
out:
mutex_unlock(&bdev->bd_fsfreeze_mutex);
return error;
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index 61c5f9d26018..e31d6791d3e3 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -2166,7 +2166,10 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
if (err)
return err;

+ if (!get_active_super(sbi->sb->s_bdev))
+ return -ENOTTY;
freeze_super(sbi->sb);
+
f2fs_down_write(&sbi->gc_lock);
f2fs_down_write(&sbi->cp_global_sem);

@@ -2217,6 +2220,8 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
out_err:
f2fs_up_write(&sbi->cp_global_sem);
f2fs_up_write(&sbi->gc_lock);
+ /* We use the same active reference from freeze */
thaw_super(sbi->sb);
+ deactivate_locked_super(sbi->sb);
return err;
}
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 5eed8c237500..e57cb593e2f3 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -676,7 +676,12 @@ void gfs2_freeze_func(struct work_struct *work)
struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
struct super_block *sb = sdp->sd_vfs;

- atomic_inc(&sb->s_active);
+ if (!get_active_super(sb->s_bdev)) {
+ fs_info(sdp, "GFS2: couldn't grap super for thaw for filesystem\n");
+ gfs2_assert_withdraw(sdp, 0);
+ return;
+ }
+
error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
if (error) {
gfs2_assert_withdraw(sdp, 0);
@@ -690,7 +695,7 @@ void gfs2_freeze_func(struct work_struct *work)
}
gfs2_freeze_unlock(&freeze_gh);
}
- deactivate_super(sb);
+ deactivate_locked_super(sb);
clear_bit_unlock(SDF_FS_FROZEN, &sdp->sd_flags);
wake_up_bit(&sdp->sd_flags, SDF_FS_FROZEN);
return;
diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
index 454dc2ff8b5e..cbb71c3520c0 100644
--- a/fs/gfs2/sys.c
+++ b/fs/gfs2/sys.c
@@ -164,6 +164,9 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
if (!capable(CAP_SYS_ADMIN))
return -EPERM;

+ if (!get_active_super(sb->s_bdev))
+ return -ENOTTY;
+
switch (n) {
case 0:
error = thaw_super(sdp->sd_vfs);
@@ -172,9 +175,12 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
error = freeze_super(sdp->sd_vfs);
break;
default:
+ deactivate_locked_super(sb);
return -EINVAL;
}

+ deactivate_locked_super(sb);
+
if (error) {
fs_warn(sdp, "freeze %d error %d\n", n, error);
return error;
diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
index 7a6aeffcdf5c..3a0cd5e9ad84 100644
--- a/fs/gfs2/util.c
+++ b/fs/gfs2/util.c
@@ -345,10 +345,15 @@ int gfs2_withdraw(struct gfs2_sbd *sdp)
set_bit(SDF_WITHDRAW_IN_PROG, &sdp->sd_flags);

if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW) {
+ if (!get_active_super(sb->s_bdev)) {
+ fs_err(sdp, "could not grab super on withdraw for file system\n");
+ return -1;
+ }
fs_err(sdp, "about to withdraw this file system\n");
BUG_ON(sdp->sd_args.ar_debug);

signal_our_withdraw(sdp);
+ deactivate_locked_super(sb);

kobject_uevent(&sdp->sd_kobj, KOBJ_OFFLINE);

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 5b2481cd4750..1d20af762e0d 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -386,6 +386,7 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
static int ioctl_fsfreeze(struct file *filp)
{
struct super_block *sb = file_inode(filp)->i_sb;
+ int ret;

if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
return -EPERM;
@@ -394,10 +395,17 @@ static int ioctl_fsfreeze(struct file *filp)
if (sb->s_op->freeze_fs == NULL && sb->s_op->freeze_super == NULL)
return -EOPNOTSUPP;

+ if (!get_active_super(sb->s_bdev))
+ return -ENOTTY;
+
/* Freeze */
if (sb->s_op->freeze_super)
- return sb->s_op->freeze_super(sb);
- return freeze_super(sb);
+ ret = sb->s_op->freeze_super(sb);
+ ret = freeze_super(sb);
+
+ deactivate_locked_super(sb);
+
+ return ret;
}

static int ioctl_fsthaw(struct file *filp)
diff --git a/fs/super.c b/fs/super.c
index 34afe411cf2b..0e9d48846684 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -39,8 +39,6 @@
#include <uapi/linux/mount.h>
#include "internal.h"

-static int thaw_super_locked(struct super_block *sb);
-
static LIST_HEAD(super_blocks);
static DEFINE_SPINLOCK(sb_lock);

@@ -851,13 +849,13 @@ struct super_block *get_active_super(struct block_device *bdev)
if (sb->s_bdev == bdev) {
if (!grab_super(sb))
goto restart;
- up_write(&sb->s_umount);
return sb;
}
}
spin_unlock(&sb_lock);
return NULL;
}
+EXPORT_SYMBOL_GPL(get_active_super);

struct super_block *user_get_super(dev_t dev, bool excl)
{
@@ -1024,13 +1022,13 @@ void emergency_remount(void)

static void do_thaw_all_callback(struct super_block *sb)
{
- down_write(&sb->s_umount);
+ if (!get_active_super(sb->s_bdev))
+ return;
if (sb->s_root && sb->s_flags & SB_BORN) {
emergency_thaw_bdev(sb);
- thaw_super_locked(sb);
- } else {
- up_write(&sb->s_umount);
+ thaw_super(sb);
}
+ deactivate_locked_super(sb);
}

static void do_thaw_all(struct work_struct *work)
@@ -1636,10 +1634,13 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
}

/**
- * freeze_super - lock the filesystem and force it into a consistent state
+ * freeze_super - force a filesystem backed by a block device into a consistent state
* @sb: the super to lock
*
- * Syncs the super to make sure the filesystem is consistent and calls the fs's
+ * Used by filesystems and the kernel to freeze a fileystem backed by a block
+ * device into a consistent state. Callers must use get_active_super(bdev) to
+ * lock the @sb and when done must unlock it with deactivate_locked_super().
+ * Syncs the filesystem backed by the @sb and calls the filesystem's optional
* freeze_fs. Subsequent calls to this without first thawing the fs will return
* -EBUSY.
*
@@ -1672,22 +1673,15 @@ int freeze_super(struct super_block *sb)
{
int ret;

- atomic_inc(&sb->s_active);
- down_write(&sb->s_umount);
- if (sb->s_writers.frozen != SB_UNFROZEN) {
- deactivate_locked_super(sb);
+ if (sb->s_writers.frozen != SB_UNFROZEN)
return -EBUSY;
- }

- if (!(sb->s_flags & SB_BORN)) {
- up_write(&sb->s_umount);
+ if (!(sb->s_flags & SB_BORN))
return 0; /* sic - it's "nothing to do" */
- }

if (sb_rdonly(sb)) {
/* Nothing to do really... */
sb->s_writers.frozen = SB_FREEZE_COMPLETE;
- up_write(&sb->s_umount);
return 0;
}

@@ -1707,7 +1701,6 @@ int freeze_super(struct super_block *sb)
sb->s_writers.frozen = SB_UNFROZEN;
sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
wake_up(&sb->s_writers.wait_unfrozen);
- deactivate_locked_super(sb);
return ret;
}

@@ -1723,7 +1716,6 @@ int freeze_super(struct super_block *sb)
sb->s_writers.frozen = SB_UNFROZEN;
sb_freeze_unlock(sb, SB_FREEZE_FS);
wake_up(&sb->s_writers.wait_unfrozen);
- deactivate_locked_super(sb);
return ret;
}
}
@@ -1733,19 +1725,25 @@ int freeze_super(struct super_block *sb)
*/
sb->s_writers.frozen = SB_FREEZE_COMPLETE;
lockdep_sb_freeze_release(sb);
- up_write(&sb->s_umount);
return 0;
}
EXPORT_SYMBOL(freeze_super);

-static int thaw_super_locked(struct super_block *sb)
+/**
+ * thaw_super -- unlock a filesystem backed by a block device
+ * @sb: the super to thaw
+ *
+ * Used by filesystems and the kernel to thaw a fileystem backed by a block
+ * device. Callers must use get_active_super(bdev) to lock the @sb and when
+ * done must unlock it with deactivate_locked_super(). Once done, this marks
+ * the filesystem as writeable.
+ */
+int thaw_super(struct super_block *sb)
{
int error;

- if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
- up_write(&sb->s_umount);
+ if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
return -EINVAL;
- }

if (sb_rdonly(sb)) {
sb->s_writers.frozen = SB_UNFROZEN;
@@ -1760,7 +1758,6 @@ static int thaw_super_locked(struct super_block *sb)
printk(KERN_ERR
"VFS:Filesystem thaw failed\n");
lockdep_sb_freeze_release(sb);
- up_write(&sb->s_umount);
return error;
}
}
@@ -1769,21 +1766,8 @@ static int thaw_super_locked(struct super_block *sb)
sb_freeze_unlock(sb, SB_FREEZE_FS);
out:
wake_up(&sb->s_writers.wait_unfrozen);
- deactivate_locked_super(sb);
return 0;
}
-
-/**
- * thaw_super -- unlock filesystem
- * @sb: the super to thaw
- *
- * Unlocks the filesystem and marks it writeable again after freeze_super().
- */
-int thaw_super(struct super_block *sb)
-{
- down_write(&sb->s_umount);
- return thaw_super_locked(sb);
-}
EXPORT_SYMBOL(thaw_super);

/*
--
2.39.2

2023-05-13 22:30:38

by Askar Safin

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

Will this patch fix a long-standing fuse vs suspend bug? (
https://bugzilla.kernel.org/show_bug.cgi?id=34932 )

Please, CC when answering

--
Askar Safin

2023-05-18 05:51:16

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH 1/6] fs: unify locking semantics for fs freeze / thaw

On Sun, May 07, 2023 at 06:17:12PM -0700, Luis Chamberlain wrote:
> Right now freeze_super() and thaw_super() are called with
> different locking contexts. To expand on this is messy, so
> just unify the requirement to require grabbing an active
> reference and keep the superblock locked.
>
> Suggested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Luis Chamberlain <[email protected]>
> ---
> block/bdev.c | 5 +++-
> fs/f2fs/gc.c | 5 ++++
> fs/gfs2/super.c | 9 +++++--
> fs/gfs2/sys.c | 6 +++++
> fs/gfs2/util.c | 5 ++++
> fs/ioctl.c | 12 ++++++++--
> fs/super.c | 62 ++++++++++++++++++-------------------------------
> 7 files changed, 60 insertions(+), 44 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index 21c63bfef323..dc54a2a1c46e 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -251,7 +251,7 @@ int freeze_bdev(struct block_device *bdev)
> error = sb->s_op->freeze_super(sb);
> else
> error = freeze_super(sb);
> - deactivate_super(sb);
> + deactivate_locked_super(sb);
>
> if (error) {
> bdev->bd_fsfreeze_count--;
> @@ -289,6 +289,8 @@ int thaw_bdev(struct block_device *bdev)
> sb = bdev->bd_fsfreeze_sb;
> if (!sb)
> goto out;
> + if (!get_active_super(bdev))
> + goto out;
>
> if (sb->s_op->thaw_super)
> error = sb->s_op->thaw_super(sb);
> @@ -298,6 +300,7 @@ int thaw_bdev(struct block_device *bdev)
> bdev->bd_fsfreeze_count++;
> else
> bdev->bd_fsfreeze_sb = NULL;
> + deactivate_locked_super(sb);
> out:
> mutex_unlock(&bdev->bd_fsfreeze_mutex);
> return error;
> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 61c5f9d26018..e31d6791d3e3 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -2166,7 +2166,10 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
> if (err)
> return err;
>
> + if (!get_active_super(sbi->sb->s_bdev))
> + return -ENOTTY;
> freeze_super(sbi->sb);
> +
> f2fs_down_write(&sbi->gc_lock);
> f2fs_down_write(&sbi->cp_global_sem);
>
> @@ -2217,6 +2220,8 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
> out_err:
> f2fs_up_write(&sbi->cp_global_sem);
> f2fs_up_write(&sbi->gc_lock);
> + /* We use the same active reference from freeze */
> thaw_super(sbi->sb);
> + deactivate_locked_super(sbi->sb);
> return err;
> }
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 5eed8c237500..e57cb593e2f3 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -676,7 +676,12 @@ void gfs2_freeze_func(struct work_struct *work)
> struct gfs2_sbd *sdp = container_of(work, struct gfs2_sbd, sd_freeze_work);
> struct super_block *sb = sdp->sd_vfs;
>
> - atomic_inc(&sb->s_active);
> + if (!get_active_super(sb->s_bdev)) {
> + fs_info(sdp, "GFS2: couldn't grap super for thaw for filesystem\n");
> + gfs2_assert_withdraw(sdp, 0);
> + return;
> + }
> +
> error = gfs2_freeze_lock(sdp, &freeze_gh, 0);
> if (error) {
> gfs2_assert_withdraw(sdp, 0);
> @@ -690,7 +695,7 @@ void gfs2_freeze_func(struct work_struct *work)
> }
> gfs2_freeze_unlock(&freeze_gh);
> }
> - deactivate_super(sb);
> + deactivate_locked_super(sb);
> clear_bit_unlock(SDF_FS_FROZEN, &sdp->sd_flags);
> wake_up_bit(&sdp->sd_flags, SDF_FS_FROZEN);
> return;
> diff --git a/fs/gfs2/sys.c b/fs/gfs2/sys.c
> index 454dc2ff8b5e..cbb71c3520c0 100644
> --- a/fs/gfs2/sys.c
> +++ b/fs/gfs2/sys.c
> @@ -164,6 +164,9 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
> if (!capable(CAP_SYS_ADMIN))
> return -EPERM;
>
> + if (!get_active_super(sb->s_bdev))
> + return -ENOTTY;
> +
> switch (n) {
> case 0:
> error = thaw_super(sdp->sd_vfs);
> @@ -172,9 +175,12 @@ static ssize_t freeze_store(struct gfs2_sbd *sdp, const char *buf, size_t len)
> error = freeze_super(sdp->sd_vfs);
> break;
> default:
> + deactivate_locked_super(sb);
> return -EINVAL;
> }
>
> + deactivate_locked_super(sb);
> +
> if (error) {
> fs_warn(sdp, "freeze %d error %d\n", n, error);
> return error;
> diff --git a/fs/gfs2/util.c b/fs/gfs2/util.c
> index 7a6aeffcdf5c..3a0cd5e9ad84 100644
> --- a/fs/gfs2/util.c
> +++ b/fs/gfs2/util.c
> @@ -345,10 +345,15 @@ int gfs2_withdraw(struct gfs2_sbd *sdp)
> set_bit(SDF_WITHDRAW_IN_PROG, &sdp->sd_flags);
>
> if (sdp->sd_args.ar_errors == GFS2_ERRORS_WITHDRAW) {
> + if (!get_active_super(sb->s_bdev)) {
> + fs_err(sdp, "could not grab super on withdraw for file system\n");
> + return -1;
> + }
> fs_err(sdp, "about to withdraw this file system\n");
> BUG_ON(sdp->sd_args.ar_debug);
>
> signal_our_withdraw(sdp);
> + deactivate_locked_super(sb);
>
> kobject_uevent(&sdp->sd_kobj, KOBJ_OFFLINE);
>
> diff --git a/fs/ioctl.c b/fs/ioctl.c
> index 5b2481cd4750..1d20af762e0d 100644
> --- a/fs/ioctl.c
> +++ b/fs/ioctl.c
> @@ -386,6 +386,7 @@ static int ioctl_fioasync(unsigned int fd, struct file *filp,
> static int ioctl_fsfreeze(struct file *filp)
> {
> struct super_block *sb = file_inode(filp)->i_sb;
> + int ret;
>
> if (!ns_capable(sb->s_user_ns, CAP_SYS_ADMIN))
> return -EPERM;
> @@ -394,10 +395,17 @@ static int ioctl_fsfreeze(struct file *filp)
> if (sb->s_op->freeze_fs == NULL && sb->s_op->freeze_super == NULL)
> return -EOPNOTSUPP;
>
> + if (!get_active_super(sb->s_bdev))

Um... doesn't this mean that we now cannot freeze network and pseudo
filesystems?

> + return -ENOTTY;
> +
> /* Freeze */
> if (sb->s_op->freeze_super)
> - return sb->s_op->freeze_super(sb);
> - return freeze_super(sb);
> + ret = sb->s_op->freeze_super(sb);
> + ret = freeze_super(sb);
> +
> + deactivate_locked_super(sb);
> +
> + return ret;
> }
>
> static int ioctl_fsthaw(struct file *filp)
> diff --git a/fs/super.c b/fs/super.c
> index 34afe411cf2b..0e9d48846684 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -39,8 +39,6 @@
> #include <uapi/linux/mount.h>
> #include "internal.h"
>
> -static int thaw_super_locked(struct super_block *sb);
> -
> static LIST_HEAD(super_blocks);
> static DEFINE_SPINLOCK(sb_lock);
>
> @@ -851,13 +849,13 @@ struct super_block *get_active_super(struct block_device *bdev)
> if (sb->s_bdev == bdev) {
> if (!grab_super(sb))
> goto restart;
> - up_write(&sb->s_umount);
> return sb;
> }
> }
> spin_unlock(&sb_lock);
> return NULL;
> }
> +EXPORT_SYMBOL_GPL(get_active_super);
>
> struct super_block *user_get_super(dev_t dev, bool excl)
> {
> @@ -1024,13 +1022,13 @@ void emergency_remount(void)
>
> static void do_thaw_all_callback(struct super_block *sb)
> {
> - down_write(&sb->s_umount);
> + if (!get_active_super(sb->s_bdev))
> + return;
> if (sb->s_root && sb->s_flags & SB_BORN) {
> emergency_thaw_bdev(sb);
> - thaw_super_locked(sb);
> - } else {
> - up_write(&sb->s_umount);
> + thaw_super(sb);
> }
> + deactivate_locked_super(sb);
> }
>
> static void do_thaw_all(struct work_struct *work)
> @@ -1636,10 +1634,13 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
> }
>
> /**
> - * freeze_super - lock the filesystem and force it into a consistent state
> + * freeze_super - force a filesystem backed by a block device into a consistent state
> * @sb: the super to lock
> *
> - * Syncs the super to make sure the filesystem is consistent and calls the fs's
> + * Used by filesystems and the kernel to freeze a fileystem backed by a block
> + * device into a consistent state. Callers must use get_active_super(bdev) to
> + * lock the @sb and when done must unlock it with deactivate_locked_super().
> + * Syncs the filesystem backed by the @sb and calls the filesystem's optional
> * freeze_fs. Subsequent calls to this without first thawing the fs will return
> * -EBUSY.
> *
> @@ -1672,22 +1673,15 @@ int freeze_super(struct super_block *sb)
> {
> int ret;
>
> - atomic_inc(&sb->s_active);
> - down_write(&sb->s_umount);
> - if (sb->s_writers.frozen != SB_UNFROZEN) {
> - deactivate_locked_super(sb);
> + if (sb->s_writers.frozen != SB_UNFROZEN)
> return -EBUSY;
> - }
>
> - if (!(sb->s_flags & SB_BORN)) {
> - up_write(&sb->s_umount);
> + if (!(sb->s_flags & SB_BORN))
> return 0; /* sic - it's "nothing to do" */
> - }
>
> if (sb_rdonly(sb)) {
> /* Nothing to do really... */
> sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> - up_write(&sb->s_umount);
> return 0;
> }
>
> @@ -1707,7 +1701,6 @@ int freeze_super(struct super_block *sb)
> sb->s_writers.frozen = SB_UNFROZEN;
> sb_freeze_unlock(sb, SB_FREEZE_PAGEFAULT);
> wake_up(&sb->s_writers.wait_unfrozen);
> - deactivate_locked_super(sb);
> return ret;
> }
>
> @@ -1723,7 +1716,6 @@ int freeze_super(struct super_block *sb)
> sb->s_writers.frozen = SB_UNFROZEN;
> sb_freeze_unlock(sb, SB_FREEZE_FS);
> wake_up(&sb->s_writers.wait_unfrozen);
> - deactivate_locked_super(sb);
> return ret;
> }
> }
> @@ -1733,19 +1725,25 @@ int freeze_super(struct super_block *sb)
> */
> sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> lockdep_sb_freeze_release(sb);
> - up_write(&sb->s_umount);
> return 0;

Also ... before this change, a successful freeze means that we retain
the elevated s_active during the freeze. Now the callers of this
function always call deactivate_locked_super, which decrements the
active count, even if the fs is actually frozen. The active count no
longer works the way it used to. Can that lead to freeing a frozen
super?

Alternately, I guess we could take our own s_active reference here.

--D

> }
> EXPORT_SYMBOL(freeze_super);
>
> -static int thaw_super_locked(struct super_block *sb)
> +/**
> + * thaw_super -- unlock a filesystem backed by a block device
> + * @sb: the super to thaw
> + *
> + * Used by filesystems and the kernel to thaw a fileystem backed by a block
> + * device. Callers must use get_active_super(bdev) to lock the @sb and when
> + * done must unlock it with deactivate_locked_super(). Once done, this marks
> + * the filesystem as writeable.
> + */
> +int thaw_super(struct super_block *sb)
> {
> int error;
>
> - if (sb->s_writers.frozen != SB_FREEZE_COMPLETE) {
> - up_write(&sb->s_umount);
> + if (sb->s_writers.frozen != SB_FREEZE_COMPLETE)
> return -EINVAL;
> - }
>
> if (sb_rdonly(sb)) {
> sb->s_writers.frozen = SB_UNFROZEN;
> @@ -1760,7 +1758,6 @@ static int thaw_super_locked(struct super_block *sb)
> printk(KERN_ERR
> "VFS:Filesystem thaw failed\n");
> lockdep_sb_freeze_release(sb);
> - up_write(&sb->s_umount);
> return error;
> }
> }
> @@ -1769,21 +1766,8 @@ static int thaw_super_locked(struct super_block *sb)
> sb_freeze_unlock(sb, SB_FREEZE_FS);
> out:
> wake_up(&sb->s_writers.wait_unfrozen);
> - deactivate_locked_super(sb);
> return 0;
> }
> -
> -/**
> - * thaw_super -- unlock filesystem
> - * @sb: the super to thaw
> - *
> - * Unlocks the filesystem and marks it writeable again after freeze_super().
> - */
> -int thaw_super(struct super_block *sb)
> -{
> - down_write(&sb->s_umount);
> - return thaw_super_locked(sb);
> -}
> EXPORT_SYMBOL(thaw_super);
>
> /*
> --
> 2.39.2
>

2023-05-25 12:31:12

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 1/6] fs: unify locking semantics for fs freeze / thaw

On Sun 07-05-23 18:17:12, Luis Chamberlain wrote:
> Right now freeze_super() and thaw_super() are called with
> different locking contexts. To expand on this is messy, so
> just unify the requirement to require grabbing an active
> reference and keep the superblock locked.
>
> Suggested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Luis Chamberlain <[email protected]>

Finally got around to looking at this. Sorry for the delay. In principle I
like the direction but see below:

> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> index 61c5f9d26018..e31d6791d3e3 100644
> --- a/fs/f2fs/gc.c
> +++ b/fs/f2fs/gc.c
> @@ -2166,7 +2166,10 @@ int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count)
> if (err)
> return err;
>
> + if (!get_active_super(sbi->sb->s_bdev))
> + return -ENOTTY;

Calling get_active_super() like this is just sick. You rather want to
provide a helper for grabbing another active sb reference and locking the
sb when you already have sb reference. Because that is what is needed in
the vast majority of the places. Something like

void grab_active_super(struct super_block *sb)
{
down_write(sb->s_umount);
atomic_inc(&s->s_active);
}

> @@ -851,13 +849,13 @@ struct super_block *get_active_super(struct block_device *bdev)
> if (sb->s_bdev == bdev) {
> if (!grab_super(sb))
> goto restart;
> - up_write(&sb->s_umount);
> return sb;
> }
> }
> spin_unlock(&sb_lock);
> return NULL;
> }
> +EXPORT_SYMBOL_GPL(get_active_super);

And I'd call this grab_bdev_super() and no need to export it when you have
grab_active_super().

> @@ -1636,10 +1634,13 @@ static void sb_freeze_unlock(struct super_block *sb, int level)
> }
>
> /**
> - * freeze_super - lock the filesystem and force it into a consistent state
> + * freeze_super - force a filesystem backed by a block device into a consistent state
> * @sb: the super to lock
> *
> - * Syncs the super to make sure the filesystem is consistent and calls the fs's
> + * Used by filesystems and the kernel to freeze a fileystem backed by a block
> + * device into a consistent state. Callers must use get_active_super(bdev) to
> + * lock the @sb and when done must unlock it with deactivate_locked_super().
> + * Syncs the filesystem backed by the @sb and calls the filesystem's optional
> * freeze_fs. Subsequent calls to this without first thawing the fs will return
> * -EBUSY.
> *
> @@ -1672,22 +1673,15 @@ int freeze_super(struct super_block *sb)
> {
> int ret;
>
> - atomic_inc(&sb->s_active);
> - down_write(&sb->s_umount);
> - if (sb->s_writers.frozen != SB_UNFROZEN) {
> - deactivate_locked_super(sb);

At least add a warning for s_umount not being held here?

> + if (sb->s_writers.frozen != SB_UNFROZEN)
> return -EBUSY;
> - }
>
> - if (!(sb->s_flags & SB_BORN)) {
> - up_write(&sb->s_umount);
> + if (!(sb->s_flags & SB_BORN))
> return 0; /* sic - it's "nothing to do" */
> - }
>
> if (sb_rdonly(sb)) {
> /* Nothing to do really... */
> sb->s_writers.frozen = SB_FREEZE_COMPLETE;
> - up_write(&sb->s_umount);
> return 0;
> }

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2023-06-06 15:00:44

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

On Sun, 14 May 2023 at 00:04, Askar Safin <[email protected]> wrote:
>
> Will this patch fix a long-standing fuse vs suspend bug? (
> https://bugzilla.kernel.org/show_bug.cgi?id=34932 )

No.

The solution to the fuse issue is to freeze processes that initiate
fuse requests *before* freezing processes that serve fuse requests.

The problem is finding out which is which. This can be complicated by
the fact that a process could be both serving requests *and*
initiating them (even without knowing).

The best idea so far is to let fuse servers set a process flag
(PF_FREEZE_LATE) that is inherited across fork/clone. For example the
sshfs server would do the following before starting request processing
or starting ssh:

echo 1 > /proc/self/freeze_late

This would make the sshfs and ssh processes be frozen after processes
that call into the sshfs mount.

After normal (non-server) processes are frozen, server processes
should not be getting new requests and can be frozen.

Issues remaining:

- if requests are stuck (e.g. network is down) then the requester
process can't be frozen and suspend will still fail.

- if server process is generating filesystem activity (new fuse
requests) spontaneously, then there's nothing to differentiate between
server processes and we are back to the original problem

Solution to both these are probably non-kernel: impacted servers need
to receive notification from systemd when suspend is starting and act
accordingly.

Attaching work-in-progress patch. This needs to be improved to freeze
server processes in a separate phase from kernel threads, but it
should be able to demonstrate the idea.

Thanks,
Miklos


Attachments:
0001_freezer_configure_user_space_process_frozen_along_with_kernel_threads.patch (6.95 kB)

2023-06-06 16:08:39

by Askar Safin

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

Thanks a lot for the answer

On Tue, Jun 6, 2023 at 5:38 PM Miklos Szeredi <[email protected]> wrote:
> - if requests are stuck (e.g. network is down) then the requester
> process can't be frozen and suspend will still fail

Unfortunately, this is exactly the problem I sometimes face. If
network is up, then suspend works normally. But if network is down
(and I'm trying to access sshfs filesystem in that moment), then
suspend doesn't work.

So, it seems your solution is not for me

> Solution to both these are probably non-kernel: impacted servers need
> to receive notification from systemd when suspend is starting and act
> accordingly.
Okay, I will probably forward this to sshfs devs

--
Askar Safin

2023-06-06 19:49:15

by Nikolaus Rath

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 0/6] vfs: provide automatic kernel freeze / resume

On Jun 06 2023, Miklos Szeredi via fuse-devel <[email protected]> wrote:
> On Sun, 14 May 2023 at 00:04, Askar Safin <[email protected]> wrote:
>>
>> Will this patch fix a long-standing fuse vs suspend bug? (
>> https://bugzilla.kernel.org/show_bug.cgi?id=34932 )
>
> No.
>
> The solution to the fuse issue is to freeze processes that initiate
> fuse requests *before* freezing processes that serve fuse requests.
>
> The problem is finding out which is which. This can be complicated by
> the fact that a process could be both serving requests *and*
> initiating them (even without knowing).
>
> The best idea so far is to let fuse servers set a process flag
> (PF_FREEZE_LATE) that is inherited across fork/clone.

Is that the same as what userspace calls PR_SET_IO_FLUSHER? From
prctl(2):

PR_SET_IO_FLUSHER (since Linux 5.6)
If a user process is involved in the block layer or filesystem I/O path, and
can allocate memory while processing I/O requests it must set arg2 to 1. This
will put the process in the IO_FLUSHER state, which allows it special treat‐
ment to make progress when allocating memory. [..]

The calling process must have the CAP_SYS_RESOURCE capability.[...]

Examples of IO_FLUSHER applications are FUSE daemons, SCSI device emulation
daemons, and daemons that perform error handling like multipath path recovery
applications.


To me this sounds like it captures the relevant information (process is
involved in filesystem I/O) rather than just a preferred behavior (flush
late) and may thus be a better choice...

Best,
-Nikolaus

2023-06-06 20:40:29

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume



On 6/6/23 16:37, Miklos Szeredi wrote:
> On Sun, 14 May 2023 at 00:04, Askar Safin <[email protected]> wrote:
>>
>> Will this patch fix a long-standing fuse vs suspend bug? (
>> https://bugzilla.kernel.org/show_bug.cgi?id=34932 )
>
> No.
>
> The solution to the fuse issue is to freeze processes that initiate
> fuse requests *before* freezing processes that serve fuse requests.
>
> The problem is finding out which is which. This can be complicated by
> the fact that a process could be both serving requests *and*
> initiating them (even without knowing).
>
> The best idea so far is to let fuse servers set a process flag
> (PF_FREEZE_LATE) that is inherited across fork/clone. For example the
> sshfs server would do the following before starting request processing
> or starting ssh:
>
> echo 1 > /proc/self/freeze_late
>
> This would make the sshfs and ssh processes be frozen after processes
> that call into the sshfs mount.

Hmm, why would this need to be done manually on the server (daemon)
side? It could be automated on the fuse kernel side, for example in
process_init_reply() using current task context?

A slightly better version would give scores, the later the daemon/server
is created the higher its freezing score - would help a bit with stacked
fuse file systems, although not perfectly. For that struct task would
need to be extended, though.

>
> After normal (non-server) processes are frozen, server processes
> should not be getting new requests and can be frozen.
>
> Issues remaining:
>
> - if requests are stuck (e.g. network is down) then the requester
> process can't be frozen and suspend will still fail.
>
> - if server process is generating filesystem activity (new fuse
> requests) spontaneously, then there's nothing to differentiate between
> server processes and we are back to the original problem
>
> Solution to both these are probably non-kernel: impacted servers need
> to receive notification from systemd when suspend is starting and act
> accordingly.
>
> Attaching work-in-progress patch. This needs to be improved to freeze
> server processes in a separate phase from kernel threads, but it
> should be able to demonstrate the idea.


Thanks,
Bernd

2023-06-07 07:21:55

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [fuse-devel] [PATCH 0/6] vfs: provide automatic kernel freeze / resume

On Tue, 6 Jun 2023 at 21:37, Nikolaus Rath <[email protected]> wrote:
>
> On Jun 06 2023, Miklos Szeredi via fuse-devel <[email protected]> wrote:
> > On Sun, 14 May 2023 at 00:04, Askar Safin <[email protected]> wrote:
> >>
> >> Will this patch fix a long-standing fuse vs suspend bug? (
> >> https://bugzilla.kernel.org/show_bug.cgi?id=34932 )
> >
> > No.
> >
> > The solution to the fuse issue is to freeze processes that initiate
> > fuse requests *before* freezing processes that serve fuse requests.
> >
> > The problem is finding out which is which. This can be complicated by
> > the fact that a process could be both serving requests *and*
> > initiating them (even without knowing).
> >
> > The best idea so far is to let fuse servers set a process flag
> > (PF_FREEZE_LATE) that is inherited across fork/clone.
>
> Is that the same as what userspace calls PR_SET_IO_FLUSHER? From
> prctl(2):
>
> PR_SET_IO_FLUSHER (since Linux 5.6)
> If a user process is involved in the block layer or filesystem I/O path, and
> can allocate memory while processing I/O requests it must set arg2 to 1. This
> will put the process in the IO_FLUSHER state, which allows it special treat‐
> ment to make progress when allocating memory. [..]
>
> The calling process must have the CAP_SYS_RESOURCE capability.[...]

This is the issue. We want suspend to work without needing privileges.

>
> Examples of IO_FLUSHER applications are FUSE daemons, SCSI device emulation
> daemons, and daemons that perform error handling like multipath path recovery
> applications.

This looks incorrect. FUSE shouldn't need this because it manages
writeback in a way not to require such special treatment.

It might make sense to use the prctl(2) API for this, but honestly I
prefer pseudo fs interfaces for such knobs.

Thanks,
Miklos

2023-06-07 07:39:15

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

On Tue, 6 Jun 2023 at 22:18, Bernd Schubert <[email protected]> wrote:
>
>
>
> On 6/6/23 16:37, Miklos Szeredi wrote:
> > On Sun, 14 May 2023 at 00:04, Askar Safin <[email protected]> wrote:
> >>
> >> Will this patch fix a long-standing fuse vs suspend bug? (
> >> https://bugzilla.kernel.org/show_bug.cgi?id=34932 )
> >
> > No.
> >
> > The solution to the fuse issue is to freeze processes that initiate
> > fuse requests *before* freezing processes that serve fuse requests.
> >
> > The problem is finding out which is which. This can be complicated by
> > the fact that a process could be both serving requests *and*
> > initiating them (even without knowing).
> >
> > The best idea so far is to let fuse servers set a process flag
> > (PF_FREEZE_LATE) that is inherited across fork/clone. For example the
> > sshfs server would do the following before starting request processing
> > or starting ssh:
> >
> > echo 1 > /proc/self/freeze_late
> >
> > This would make the sshfs and ssh processes be frozen after processes
> > that call into the sshfs mount.
>
> Hmm, why would this need to be done manually on the server (daemon)
> side? It could be automated on the fuse kernel side, for example in
> process_init_reply() using current task context?

Setting the flag for the current task wouldn't be sufficient, it would
need to set it for all threads of a process. Even that wouldn't work
for e.g. sshfs, which forks off ssh before starting request
processing.

So I'd prefer setting this explicitly. This could be done from
libfuse, before starting threads. Or, as in the case of sshfs, it
could be done by the filesystem itself.

>
> A slightly better version would give scores, the later the daemon/server
> is created the higher its freezing score - would help a bit with stacked
> fuse file systems, although not perfectly. For that struct task would
> need to be extended, though.

If we can quiesce the top of the stack, then hopefully all the lower
ones will also have no activity. There could be special cases, but
that would need to be dealt with in the fuse server itself.

Thanks,
Miklos

2023-06-07 08:35:45

by Bernd Schubert

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

On 6/7/23 09:21, Miklos Szeredi wrote:
> On Tue, 6 Jun 2023 at 22:18, Bernd Schubert <[email protected]> wrote:
>>
>>
>>
>> On 6/6/23 16:37, Miklos Szeredi wrote:
>>> On Sun, 14 May 2023 at 00:04, Askar Safin <[email protected]> wrote:
>>>>
>>>> Will this patch fix a long-standing fuse vs suspend bug? (
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=34932 )
>>>
>>> No.
>>>
>>> The solution to the fuse issue is to freeze processes that initiate
>>> fuse requests *before* freezing processes that serve fuse requests.
>>>
>>> The problem is finding out which is which. This can be complicated by
>>> the fact that a process could be both serving requests *and*
>>> initiating them (even without knowing).
>>>
>>> The best idea so far is to let fuse servers set a process flag
>>> (PF_FREEZE_LATE) that is inherited across fork/clone. For example the
>>> sshfs server would do the following before starting request processing
>>> or starting ssh:
>>>
>>> echo 1 > /proc/self/freeze_late
>>>
>>> This would make the sshfs and ssh processes be frozen after processes
>>> that call into the sshfs mount.
>>
>> Hmm, why would this need to be done manually on the server (daemon)
>> side? It could be automated on the fuse kernel side, for example in
>> process_init_reply() using current task context?
>
> Setting the flag for the current task wouldn't be sufficient, it would
> need to set it for all threads of a process. Even that wouldn't work
> for e.g. sshfs, which forks off ssh before starting request
> processing.

Assuming a fuse server process is not handing over requests to other
threads/forked-processes, isn't the main issue that all fuse server
tasks are frozen and none is left to take requests? A single non-frozen
thread should be sufficient for that?


>
> So I'd prefer setting this explicitly. This could be done from
> libfuse, before starting threads. Or, as in the case of sshfs, it
> could be done by the filesystem itself.

With a flag that should work, with my score proposal it would be difficult.

>
>>
>> A slightly better version would give scores, the later the daemon/server
>> is created the higher its freezing score - would help a bit with stacked
>> fuse file systems, although not perfectly. For that struct task would
>> need to be extended, though.
>
> If we can quiesce the top of the stack, then hopefully all the lower
> ones will also have no activity. There could be special cases, but
> that would need to be dealt with in the fuse server itself.


Ah, when all non flagged processes are frozen first no IO should come
in. Yeah, it mostly works, but I wonder if init/systemd is not going to
set that flag as well. And then you have an issue when fuse is on a file
system used by systemd. My long time ago initial interest on fuse is to
use fuse as root file system and I still do that for some cases - not
sure if a flag would be sufficient here. I think a freezing score would
solve more issues.
Although probably better to do step by step - flag first and score can
be added later.



Thanks,
Bernd

2023-06-07 09:49:13

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

On Wed, 7 Jun 2023 at 10:13, Bernd Schubert <[email protected]> wrote:

> Assuming a fuse server process is not handing over requests to other
> threads/forked-processes, isn't the main issue that all fuse server
> tasks are frozen and none is left to take requests? A single non-frozen
> thread should be sufficient for that?

This *might* work. But there could be auxiliary threads, or the
initial thread could be killed, etc. It would not be reliable.

> Ah, when all non flagged processes are frozen first no IO should come
> in. Yeah, it mostly works, but I wonder if init/systemd is not going to
> set that flag as well. And then you have an issue when fuse is on a file
> system used by systemd. My long time ago initial interest on fuse is to
> use fuse as root file system and I still do that for some cases - not
> sure if a flag would be sufficient here. I think a freezing score would
> solve more issues.
> Although probably better to do step by step - flag first and score can
> be added later.

I'm not sure how systemd interacts with the freezing process. If it
does, then the only sane thing to do is to make sure it doesn't have
any filesystem interaction during that time.

Hibernation is a different matter, because it needs the filesystem to
be in a working state while all userspace is frozen. Hibernate to
fuse would bring up a lot of interesting design questions.

Thanks,
Miklos

2023-06-07 11:30:10

by Askar Safin

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

I found a workaround for sshfs+suspend problem!

On Tue, Jun 6, 2023 at 5:38 PM Miklos Szeredi <[email protected]> wrote:
> Issues remaining:
>
> - if requests are stuck (e.g. network is down) then the requester
> process can't be frozen and suspend will still fail.

> Solution to both these are probably non-kernel: impacted servers need
> to receive notification from systemd when suspend is starting and act
> accordingly.

Okay, so you said that the only way to solve the problem "network is
down" is to fix the problem at the sshfs side. Unfortunately, sshfs
project was closed ( https://github.com/libfuse/sshfs ). So the only
remaining option is to use some hack. And I found such a hack!

I simply added "-o ServerAliveInterval=10" to sshfs command. This will
cause ssh process exit if remote side is unreachable. Thus the bug is
prevented. I tested the fix and it works.

But this will mean that ssh process will exit in such situation, and
thus sshfs process will exit, too. If this is not what you want, then
also add "-o reconnect" option, this will restart connection if ssh
dies. So the final command will look like this:

sshfs -o reconnect,ServerAliveInterval=10 ...

This finally solved the problem for me.

But one issue still remains: if the network goes down and then you
immediately try to access sshfs filesystem and then you will try to
suspend (and ssh doesn't yet noticed that the network gone down), then
suspend will still fail. (I tested this situation, the suspend
actually fails.) But I don't think I even will reach such situation.
The lesser ServerAliveInterval you will set, the lower is probability
that you will reach such situation

--
Askar Safin

2023-06-07 14:16:36

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

On Wed, 7 Jun 2023 at 13:13, Askar Safin <[email protected]> wrote:
>
> I found a workaround for sshfs+suspend problem!
>
> On Tue, Jun 6, 2023 at 5:38 PM Miklos Szeredi <[email protected]> wrote:
> > Issues remaining:
> >
> > - if requests are stuck (e.g. network is down) then the requester
> > process can't be frozen and suspend will still fail.
>
> > Solution to both these are probably non-kernel: impacted servers need
> > to receive notification from systemd when suspend is starting and act
> > accordingly.
>
> Okay, so you said that the only way to solve the problem "network is
> down" is to fix the problem at the sshfs side. Unfortunately, sshfs
> project was closed ( https://github.com/libfuse/sshfs ). So the only
> remaining option is to use some hack. And I found such a hack!
>
> I simply added "-o ServerAliveInterval=10" to sshfs command. This will
> cause ssh process exit if remote side is unreachable. Thus the bug is
> prevented. I tested the fix and it works.
>
> But this will mean that ssh process will exit in such situation, and
> thus sshfs process will exit, too. If this is not what you want, then
> also add "-o reconnect" option, this will restart connection if ssh
> dies. So the final command will look like this:
>
> sshfs -o reconnect,ServerAliveInterval=10 ...
>
> This finally solved the problem for me.
>
> But one issue still remains: if the network goes down and then you
> immediately try to access sshfs filesystem and then you will try to
> suspend (and ssh doesn't yet noticed that the network gone down), then
> suspend will still fail.

If the ServerAliveInterval is set to less than the freeze timeout (20s
by default) and you apply the patch and start sshfs like below, then
suspend should be reliable.

(echo 1 > /proc/self/freeze_late; sshfs ...)

Thanks,
Miklos

2023-06-08 05:14:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/6] fs: unify locking semantics for fs freeze / thaw

On Sun, May 07, 2023 at 06:17:12PM -0700, Luis Chamberlain wrote:
> Right now freeze_super() and thaw_super() are called with
> different locking contexts. To expand on this is messy, so
> just unify the requirement to require grabbing an active
> reference and keep the superblock locked.
>
> Suggested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Luis Chamberlain <[email protected]>

Maybe I'm just getting old, but where did I suggest this?

That being said, holding an active reference over any operation is a
good thing. As Jan said it can be done way simpler than this, though.

Also please explain the actual different locking contexts and what
semantics you unify in the commit log please.

> - * freeze_super - lock the filesystem and force it into a consistent state
> + * freeze_super - force a filesystem backed by a block device into a consistent state
> * @sb: the super to lock

This is not actually true. Nothing in here has anything to do
with block devices, and it is used on a at least one non-block file
system.

2023-06-08 20:16:46

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH 1/6] fs: unify locking semantics for fs freeze / thaw

On Wed, Jun 07, 2023 at 10:01:14PM -0700, Christoph Hellwig wrote:
> On Sun, May 07, 2023 at 06:17:12PM -0700, Luis Chamberlain wrote:
> > Right now freeze_super() and thaw_super() are called with
> > different locking contexts. To expand on this is messy, so
> > just unify the requirement to require grabbing an active
> > reference and keep the superblock locked.
> >
> > Suggested-by: Christoph Hellwig <[email protected]>
> > Signed-off-by: Luis Chamberlain <[email protected]>
>
> Maybe I'm just getting old, but where did I suggest this?

https://lore.kernel.org/all/[email protected]/

"I don't think we need both variants, just move the locking and s_active
acquisition out of free_super. Same for the thaw side."

> That being said, holding an active reference over any operation is a
> good thing. As Jan said it can be done way simpler than this, though.

Great.


Luis

2023-06-09 12:52:13

by Askar Safin

[permalink] [raw]
Subject: Re: [PATCH 0/6] vfs: provide automatic kernel freeze / resume

> If the ServerAliveInterval is set to less than the freeze timeout (20s
> by default) and you apply the patch and start sshfs like below, then
> suspend should be reliable.
>
> (echo 1 > /proc/self/freeze_late; sshfs ...)

Thanks a lot. I don't want to apply patches. I'm happy with my solution.

Note for other people reading this thread: ssh docs say that actual
timeout is "ServerAliveInterval * ServerAliveCountMax", so if someone
will apply this patch, he should ensure that ServerAliveInterval *
ServerAliveCountMax is less than 20 seconds.


--
Askar Safin