2017-12-01 19:05:49

by Jeff Layton

Subject: Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate

On Thu, 2017-11-30 at 17:41 +0100, Jiri Kosina wrote:
> On Fri, 1 Dec 2017, Yu Chen wrote:
>
> > BTW, is nfs able to be included in this set? I also encountered a
> > freeze() failure due to nfs access during that stage recently.
>
> The freezer usage in NFS is orders of magnitude more complicated, so it
> makes sense to first go after the lower-hanging fruit to figure out the
> viability of the whole approach in practice.
>

Agreed that we should do this in stages. It doesn't help that freezer
handling in the client is a bit of a mess at this point...

At a high level for NFS, I think we need to have freeze_fs make the RPC
engine "park" newly issued RPCs for that fs' client onto a
rpc_wait_queue. For any RPC that has already been sent, however, we need
to wait for a reply.

Once everything is quiesced we can return and call it frozen.
unfreeze_fs can then just have the engine stop parking RPCs and wake up
the waitq.
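
The park/drain/wake sequence described above can be sketched as a toy
userspace model. This is only an illustration under stated assumptions,
not kernel or sunrpc code: the class RpcGate and its methods begin_rpc,
end_rpc, freeze and unfreeze are hypothetical names, and a
threading.Condition stands in for the rpc_wait_queue.

```python
import threading


class RpcGate:
    """Toy model of park/drain/wake. All names here are hypothetical."""

    def __init__(self):
        self._cond = threading.Condition()  # stand-in for the wait queue
        self._frozen = False
        self._in_flight = 0

    def begin_rpc(self):
        """Called before sending an RPC; parks the caller while frozen."""
        with self._cond:
            while self._frozen:
                self._cond.wait()
            self._in_flight += 1

    def end_rpc(self):
        """Called when the reply for a previously sent RPC arrives."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def freeze(self):
        """Start parking new RPCs, then wait for sent ones to get replies."""
        with self._cond:
            self._frozen = True
            while self._in_flight:
                self._cond.wait()

    def unfreeze(self):
        """Stop parking new RPCs and wake everything that was parked."""
        with self._cond:
            self._frozen = False
            self._cond.notify_all()
```

In this model freeze() only returns once every RPC that was already
sent has seen its reply, and callers that reach begin_rpc() in the
meantime sleep until unfreeze() runs.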

That should be enough to make suspend and resume work more reliably. If,
however, you're interested in making the cgroup freezer also work, then
we may need to do a bit more work to ensure that we don't end up with
frozen tasks squatting on VFS locks.
--
Jeff Layton <[email protected]>


2017-12-01 21:51:17

by Dave Chinner

Subject: Re: [PATCH 00/11] fs: use freeze_fs on suspend/hibernate

On Fri, Dec 01, 2017 at 02:05:44PM -0500, Jeff Layton wrote:
> On Thu, 2017-11-30 at 17:41 +0100, Jiri Kosina wrote:
> > On Fri, 1 Dec 2017, Yu Chen wrote:
> >
> > > BTW, is nfs able to be included in this set? I also encountered a
> > > freeze() failure due to nfs access during that stage recently.
> >
> > The freezer usage in NFS is orders of magnitude more complicated, so it
> > makes sense to first go after the lower-hanging fruit to figure out the
> > viability of the whole approach in practice.
> >
>
> Agreed that we should do this in stages. It doesn't help that freezer
> handling in the client is a bit of a mess at this point...
>
> At a high level for NFS, I think we need to have freeze_fs make the RPC
> engine "park" newly issued RPCs for that fs' client onto a
> rpc_wait_queue. For any RPC that has already been sent, however, we need
> to wait for a reply.
>
> Once everything is quiesced we can return and call it frozen.
> unfreeze_fs can then just have the engine stop parking RPCs and wake up
> the waitq.

That seems pretty reasonable. Freezing is expected to take a bit of
time to run - local filesystems can do a fair bit of IO draining
queues, completing in-flight operations and bringing the journal into
a consistent state on disk before declaring the filesystem frozen.

> That should be enough to make suspend and resume work more reliably. If,
> however, you're interested in making the cgroup freezer also work, then
> we may need to do a bit more work to ensure that we don't end up with
> frozen tasks squatting on VFS locks.

None of the existing freezing code gives those guarantees. In fact,
freezing a filesystem pretty much guarantees the opposite - that
tasks *will freeze when holding VFS locks* - and so the cgroup
freezer is broken by design if it requires tasks to be frozen
without holding any VFS/filesystem lock context. So I wouldn't
really worry about the cgroup freezer....
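
As a rough illustration of the point above, here is a toy userspace
model, not kernel code. It assumes a plain threading.Lock standing in
for some VFS lock and a threading.Event standing in for the
frozen/thawed state; demo, writer, vfs_lock and thawed are all
hypothetical names.

```python
import threading


def demo(timeout=0.2):
    """Return (lock acquired while frozen, lock acquired after thaw)."""
    vfs_lock = threading.Lock()   # stand-in for some VFS lock
    thawed = threading.Event()    # clear means the filesystem is frozen
    holding = threading.Event()   # set once the writer owns the lock

    def writer():
        # Takes the lock first, then blocks on the frozen filesystem.
        with vfs_lock:
            holding.set()
            thawed.wait()

    task = threading.Thread(target=writer)
    task.start()
    holding.wait()

    # A second task cannot take the lock while the writer sits frozen.
    got_while_frozen = vfs_lock.acquire(timeout=timeout)
    if got_while_frozen:
        vfs_lock.release()

    thawed.set()                  # thaw: the writer resumes and unlocks
    task.join()

    got_after_thaw = vfs_lock.acquire(timeout=timeout)
    if got_after_thaw:
        vfs_lock.release()
    return got_while_frozen, got_after_thaw
```

Calling demo() returns (False, True): the lock stays held for as long
as the writer is blocked on the frozen state, and becomes available
again only after the thaw.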

Cheers,

Dave.
--
Dave Chinner
[email protected]