2012-02-01 13:36:47

by Pavel Machek

Subject: Re: [RFC][PATCH] PM / Sleep: Freeze filesystems during system suspend/hibernation

Hi!

> From: Rafael J. Wysocki <[email protected]>
>
> Freeze all filesystems during system suspend and (kernel-driven)
> hibernation by calling freeze_supers() for all superblocks and thaw
> them during the subsequent resume with the help of thaw_supers().
>
> This makes filesystems stay in a consistent state in case something
> goes wrong between system suspend (or hibernation) and the subsequent
> resume (e.g. journal replays won't be necessary in those cases). In

Good.

> particular, this should help to solve a long-standing issue that, in
> some cases, during resume from hibernation the boot loader causes the
> journal to be replayed for the filesystem containing the kernel image
> and/or initrd causing it to become inconsistent with the information
> stored in the hibernation image.

Ungood. Why is bootloader/initrd doing that? If it mounts filesystem
read/write, what is the guarantee that it will not change data on the
filesystem, breaking stuff?

Bootloaders should just not replay journals.

> The user-space-driven hibernation (s2disk) is not covered by this
> change, because the freezing of filesystems prevents s2disk from
> accessing device special files it needs to do its job.

...so bootloaders need to be fixed, anyway.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


2012-02-01 15:29:50

by Alan Stern

Subject: Re: [RFC][PATCH] PM / Sleep: Freeze filesystems during system suspend/hibernation

On Wed, 1 Feb 2012, Pavel Machek wrote:

> > particular, this should help to solve a long-standing issue that, in
> > some cases, during resume from hibernation the boot loader causes the
> > journal to be replayed for the filesystem containing the kernel image
> > and/or initrd causing it to become inconsistent with the information
> > stored in the hibernation image.
>
> Ungood. Why is bootloader/initrd doing that? If it mounts filesystem
> read/write, what is the guarantee that it will not change data on the
> filesystem, breaking stuff?
>
> Bootloaders should just not replay journals.
>
> > The user-space-driven hibernation (s2disk) is not covered by this
> > change, because the freezing of filesystems prevents s2disk from
> > accessing device special files it needs to do its job.
>
> ...so bootloaders need to be fixed, anyway.

I don't know about bootloaders, but from what I've heard, Linux fs
drivers (including those inside initrds) always replay the journal,
even if the filesystem is mounted read-only. This could be considered
a bug in the filesystem code.

Alan Stern

2012-02-10 02:52:32

by Jamie Lokier

Subject: Re: [RFC][PATCH] PM / Sleep: Freeze filesystems during system suspend/hibernation

Alan Stern wrote:
> On Wed, 1 Feb 2012, Pavel Machek wrote:
>
> > > particular, this should help to solve a long-standing issue that, in
> > > some cases, during resume from hibernation the boot loader causes the
> > > > journal to be replayed for the filesystem containing the kernel image
> > > and/or initrd causing it to become inconsistent with the information
> > > stored in the hibernation image.
> >
> > Ungood. Why is bootloader/initrd doing that? If it mounts filesystem
> > read/write, what is the guarantee that it will not change data on the
> > filesystem, breaking stuff?
> >
> > Bootloaders should just not replay journals.
> >
> > > The user-space-driven hibernation (s2disk) is not covered by this
> > > change, because the freezing of filesystems prevents s2disk from
> > > accessing device special files it needs to do its job.
> >
> > ...so bootloaders need to be fixed, anyway.
>
> I don't know about bootloaders, but from what I've heard, Linux fs
> drivers (including those inside initrds) always replay the journal,
> even if the filesystem is mounted read-only. This could be considered
> a bug in the filesystem code.

Theoretically a filesystem might need replay for the bootloader to see
a non-corrupt image, even for just the files it uses.

For example if the last state was in the middle of updating the root
directory, the /boot entry in the root directory might not be reliably
found without replaying the journal.

However replaying a journal when mounted read-only should probably
track journalled blocks in memory only, not commit back to the storage.

All the best,
-- Jamie

2012-02-10 09:03:36

by Jan Kara

Subject: Re: [RFC][PATCH] PM / Sleep: Freeze filesystems during system suspend/hibernation

On Fri 10-02-12 02:52:17, Jamie Lokier wrote:
> Alan Stern wrote:
> > On Wed, 1 Feb 2012, Pavel Machek wrote:
> >
> > > > particular, this should help to solve a long-standing issue that, in
> > > > some cases, during resume from hibernation the boot loader causes the
> > > > > journal to be replayed for the filesystem containing the kernel image
> > > > and/or initrd causing it to become inconsistent with the information
> > > > stored in the hibernation image.
> > >
> > > Ungood. Why is bootloader/initrd doing that? If it mounts filesystem
> > > read/write, what is the guarantee that it will not change data on the
> > > filesystem, breaking stuff?
> > >
> > > Bootloaders should just not replay journals.
> > >
> > > > The user-space-driven hibernation (s2disk) is not covered by this
> > > > change, because the freezing of filesystems prevents s2disk from
> > > > accessing device special files it needs to do its job.
> > >
> > > ...so bootloaders need to be fixed, anyway.
> >
> > I don't know about bootloaders, but from what I've heard, Linux fs
> > drivers (including those inside initrds) always replay the journal,
> > even if the filesystem is mounted read-only. This could be considered
> > a bug in the filesystem code.
>
> Theoretically a filesystem might need replay for the bootloader to see
> a non-corrupt image, even for just the files it uses.
>
> For example if the last state was in the middle of updating the root
> directory, the /boot entry in the root directory might not be reliably
> found without replaying the journal.
>
> However replaying a journal when mounted read-only should probably
> track journalled blocks in memory only, not commit back to the storage.

Yes, that would be nice, although the memory constraints of the bootloader
might make it tricky.

The reason we don't have this functionality in the kernel is not so much
the memory demands as the complexity of implementing the caching: we'd
have to create an equivalent of a dm-snapshot of the device from the mount
code to trick the generic pagecache code into loading the properly
replayed data.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
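
The read-only replay idea raised in this thread (track journalled blocks
in memory only, never commit them back to storage) can be illustrated with
a toy model. This is only a sketch under loud assumptions: the class name
ReplayOverlay, the journal layout (a list of transactions with a
"committed" flag and a list of (block number, data) pairs) and the
stop-at-first-uncommitted rule are all hypothetical simplifications, not
how ext3/ext4, jbd2, or any bootloader actually represents a journal.

```python
class ReplayOverlay:
    """Read-only view of a device with journal replay held in memory only.

    Hypothetical toy model: the device is an immutable tuple of blocks and
    the journal is a list of dicts; neither matches any real on-disk format.
    """

    def __init__(self, device_blocks, journal):
        self._device = tuple(device_blocks)  # never written to
        self._overlay = {}                   # block number -> replayed data
        for txn in journal:
            if not txn["committed"]:
                # Assumption: replay stops at the first incomplete transaction.
                break
            for blockno, data in txn["blocks"]:
                # Later committed transactions override earlier ones.
                self._overlay[blockno] = data

    def read_block(self, blockno):
        # Prefer the replayed copy; fall back to the untouched device block.
        return self._overlay.get(blockno, self._device[blockno])


# Example: the root directory block was being updated when the system went down.
device = (b"root-old", b"boot", b"data")
journal = [
    {"committed": True, "blocks": [(0, b"root-new")]},
    {"committed": False, "blocks": [(2, b"junk")]},
]
view = ReplayOverlay(device, journal)
print(view.read_block(0), view.read_block(1), view.read_block(2))
```

The reader sees the replayed root directory block while the device
contents stay exactly as the hibernation image expects them. The overlay
dictionary is also where the memory cost mentioned above shows up: it
grows with the number of journalled blocks, and in the kernel the hard
part would be getting the generic pagecache code to consult such an
overlay at all.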