2015-11-02 02:14:05

by Rafael J. Wysocki

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

On Saturday, October 31, 2015 09:19:33 AM Jiri Kosina wrote:
> On Sat, 31 Oct 2015, Rafael J. Wysocki wrote:
>
> > > > > > > I would say instead "no I/O is allowed from now on". Maybe that's an
> > > > > > > overstatement, but I think it comes closer to the truth.
> > > > >
> > > > > But that's what PM callbacks are for.
> >
> > Not really. In fact, PM callbacks may not be suitable for some use cases even
> > in theory (that's if you want something to stop running before any PM callbacks
> > are executed during suspend).
>
> If that's the case, wouldn't it then be better to have a .pre_suspend
> callback as well, instead of having to do this kind of polling?

If that's an individual driver's kthread, I guess it should be sufficient to
stop it from the .prepare callback. If it is something more generic, creating
a device for it just in order to be able to execute a PM callback from there
may be slightly overkill.

But again, I'm not aware of any real cases like that, so it may not be an issue
in practice.
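
For the individual-driver case, a toy single-threaded model in userspace C
may make the idea concrete. All names below are hypothetical and this is not
kernel code: a real driver would use the .prepare and .complete callbacks of
struct dev_pm_ops, and its thread would actually block rather than return
early as the model does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a driver's worker thread; not a kernel structure. */
struct model_worker {
	bool parked;			/* set while a system transition is in progress */
	unsigned long hw_accesses;	/* how many times the "hardware" was touched */
};

/* Stand-in for a .prepare callback: ask the worker to stay away from the HW. */
int model_prepare(struct model_worker *w)
{
	assert(w != NULL);
	w->parked = true;
	return 0;
}

/* Stand-in for a .complete callback: let the worker run again. */
void model_complete(struct model_worker *w)
{
	assert(w != NULL);
	w->parked = false;
}

/* One pass of the worker's main loop; returns true if it touched the HW. */
bool model_worker_iteration(struct model_worker *w)
{
	assert(w != NULL);
	if (w->parked)
		return false;	/* a real thread would sleep here instead */
	w->hw_accesses++;
	return true;
}
```

The point of the model is only that the check sits at one well-defined place
in the loop, so the thread never touches the device between "prepare" and
"complete".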

> > > > The most natural implementation would be for the callback routine to set
> > > > a flag; at various strategic points the kthread would check the flag and
> > > > if it was set, call a routine that sits around and waits for the suspend
> > > > to be over.
> > >
> > > Could you name at least some existing kthreads that would actually *need*
> > > such complex handling, instead of just waiting in schedule() until
> > > suspend-resume cycle is over, given that PM callbacks do all the necessary
> > > cleanup (putting HW to sleep, cancelling timers, etc) anyway?
> > >
> > > PM callback can always explicitly do kthread_stop() on a particular
> > > kthread if really necessary.
> >
> > Runtime PM uses a freezable workqueue, allocated in pm_start_workqueue().
>
> Yeah, agreed, as I said in the original mail, freezable workqueues I am
> not covering for the moment, there might be additional issues with them
> compared to kthreads. That needs to be further looked into.

OK

> > > The main point of freezer is to reach quiescent state wrt. filesystems
> > > (metadata in memory need to be absolutely in sync with what's on disk).
> > > That's no different between hibernation and s2ram, is it?
> > >
> > > BTW, quite some of this has already been "pre-discussed" in
> > > Documentation/power/freezing-of-tasks.txt (which has BTW been written
> > > before we've had the possibility to freeze filesystems, and this fact is
> > > even pointed out there).
> >
> > That is somewhat outdated in my view. At least my list of reasons for using
> > the freezer is now somewhat different from the one given in that file.
>
> Well, so what is the list then, and how do you know that kthread_run()
> users are behaving according to that list? That's basically exactly what
> this effort is about -- making some sense out of current situation.

Currently, the #1 reason for using the freezer is to prevent user space
from interacting with devices during system suspend/resume. BTW, that also
covers devices with runtime PM support, because the handling of user space
access to them in the runtime PM case may be different.

For example, if user space does a "read" or "write" on a character device
which is runtime-suspended at that point, the driver may want to resume the
device temporarily, carry out the operation and suspend it again, but that
generally won't work for the system suspend case.

Note that this applies to both system suspend (all variants of it for that
matter) and hibernation.
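
As a rough illustration, here is a toy userspace model of that difference.
The names are hypothetical and this is not kernel code. In a real driver the
get/put steps would typically be pm_runtime_get_sync() and pm_runtime_put().
The -EBUSY return is an assumption of the model: it stands for the extra
handling a driver would need if user space were not frozen.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a character device; not a kernel structure. */
struct model_dev {
	bool powered;		/* device is currently resumed */
	bool system_suspending;	/* a system-wide transition is in progress */
	int usage;		/* users currently keeping the device resumed */
	int resumes;		/* number of times the device was powered up */
};

/* Resume the device on first use, unless the whole system is suspending. */
int model_get(struct model_dev *d)
{
	if (d->system_suspending)
		return -EBUSY;	/* temporary resume is not an option here */
	if (d->usage++ == 0) {
		d->powered = true;
		d->resumes++;
	}
	return 0;
}

/* Drop one user; the device goes back to sleep when the last one is gone. */
void model_put(struct model_dev *d)
{
	assert(d->usage > 0);
	if (--d->usage == 0)
		d->powered = false;
}

/* A "read": resume temporarily, do the transfer, suspend again. */
int model_read(struct model_dev *d)
{
	int err = model_get(d);

	if (err)
		return err;
	/* the actual transfer would happen here, with d->powered == true */
	model_put(d);
	return 0;
}
```

With user space frozen, nothing can call the equivalent of model_read() during
a system transition, so drivers do not need to handle that case at all.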

The #2 reason in my view is that essentially the freezer is what makes a
difference between runtime idle and system suspend on platforms without
firmware/hardware suspend support (or the lightweight variants of suspend
in general). Namely, it effectively prevents user space from setting up
timers in the future and helps to reduce interrupt noise causing the CPUs
to leave deep low-power states too often (at least in some cases).

The #3 reason is to provide a way for things that might touch persistent
storage after a hibernation image had been created to prevent themselves from
running during that time. Note that it also is a good idea to prevent such
things from running during system suspend/resume in general so they don't
try to access devices at wrong times.

BTW, the freezing of filesystems during system suspend (not hibernation) makes
sense too, because it will help to address the long-standing issue with storage
devices that go away while the system is suspended.

I guess it may also help to address the case when a device is removed from a
suspended system, written to on another system in the meantime and inserted
back into the (still suspended) original system which then is resumed. Today
this is an almost guaranteed data corruption scenario, but if the filesystem in
question is properly frozen during suspend, the driver should be able to detect
superblock changes during unfreeze.

So the approach I'd suggest would be to add the freezing of filesystems to the
suspend/resume code paths just for the above reasons and drop the kthreads
freezing from the filesystems that support the proper freezing. The rest
should be easier to deal with then.

Thanks,
Rafael


2015-11-02 10:45:51

by Jiri Kosina

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

On Mon, 2 Nov 2015, Rafael J. Wysocki wrote:

> > > > BTW, quite some of this has already been "pre-discussed" in
> > > > Documentation/power/freezing-of-tasks.txt (which has BTW been written
> > > > before we've had the possibility to freeze filesystems, and this fact is
> > > > even pointed out there).
> > >
> > > That is somewhat outdated in my view. At least my list of reasons for using
> > > the freezer is now somewhat different from the one given in that file.
> >
> > Well, so what is the list then, and how do you know that kthread_run()
> > users are behaving according to that list? That's basically exactly what
> > this effort is about -- making some sense out of current situation.
>
> Currently, the #1 reason for using the freezer is to prevent user space
> from interacting with devices during system suspend/resume. BTW, that also
> covers devices with runtime PM support, because the handling of user space
> access to them in the runtime PM case may be different.
>
> For example, if user space does a "read" or "write" on a character device
> which is runtime-suspended at that point, the driver may want to resume the
> device temporarily, carry out the operation and suspend it again, but that
> generally won't work for the system suspend case.

But why would this even be relevant in this discussion, given that at the
point we are talking about, the whole userspace has been frozen already?

> The #2 reason in my view is that essentially the freezer is what makes a
> difference between runtime idle and system suspend on platforms without
> firmware/hardware suspend support (or the lightweight variants of suspend
> in general). Namely, it effectively prevents user space from setting up
> timers in the future and helps to reduce interrupt noise causing the CPUs
> to leave deep low-power states too often (at least in some cases).

I am of course not at all trying to remove the freezer for userspace. This
whole discussion is solely about explicit usage of freezer in kthreads.

> The #3 reason is to provide a way for things that might touch persistent
> storage after a hibernation image had been created to prevent themselves from
> running during that time. Note that it also is a good idea to prevent such
> things from running during system suspend/resume in general so they don't
> try to access devices at wrong times.
>
> BTW, the freezing of filesystems during system suspend (not hibernation) makes
> sense too, because it will help to address the long-standing issue with storage
> devices that go away while the system is suspended.
>
> I guess it may also help to address the case when a device is removed from a
> suspended system, written to on another system in the meantime and inserted
> back into the (still suspended) original system which then is resumed. Today
> this is an almost guaranteed data corruption scenario, but if the filesystem in
> question is properly frozen during suspend, the driver should be able to detect
> superblock changes during unfreeze.
>
> So the approach I'd suggest would be to add the freezing of filesystems to the
> suspend/resume code paths just for the above reasons and drop the kthreads
> freezing from the filesystems that support the proper freezing. The rest
> should be easier to deal with then.

That alone makes sense. It'll however leave a load of freezer users in the
kernel that make no sense (one example picked completely out of the air:
w1_process(); what is the reason for it there?) and are likely broken
(completely random examples again: md, xfsaild -- they think they are
freezable, but they are not).

Thanks,

--
Jiri Kosina
SUSE Labs

2015-11-02 11:02:17

by Oliver Neukum

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

On Mon, 2015-11-02 at 11:45 +0100, Jiri Kosina wrote:
> > For example, if user space does a "read" or "write" on a character device
> > which is runtime-suspended at that point, the driver may want to resume the
> > device temporarily, carry out the operation and suspend it again, but that
> > generally won't work for the system suspend case.
>
> But why would this even be relevant in this discussion, given that at the
> point we are talking about, the whole userspace has been frozen already?

It really doesn't matter whether the thread in question is a kernel
thread or user space. Device detection is even worse.
Kernel threads that do either of these things must stop at
defined points. You can use the freezer or go to another mechanism.
I just doubt they'd act much differently in the end.

Regards
Oliver

2015-11-02 15:18:16

by Alan Stern

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

On Mon, 2 Nov 2015, Jiri Kosina wrote:

> On Mon, 2 Nov 2015, Rafael J. Wysocki wrote:

> > BTW, the freezing of filesystems during system suspend (not hibernation) makes
> > sense too, because it will help to address the long-standing issue with storage
> > devices that go away while the system is suspended.
> >
> > I guess it may also help to address the case when a device is removed from a
> > suspended system, written to on another system in the meantime and inserted
> > back into the (still suspended) original system which then is resumed. Today
> > this is an almost guaranteed data corruption scenario, but if the filesystem in
> > question is properly frozen during suspend, the driver should be able to detect
> > superblock changes during unfreeze.

I agree completely; adding filesystem freezing is a great idea.

> > So the approach I'd suggest would be to add the freezing of filesystems to the
> > suspend/resume code paths just for the above reasons and drop the kthreads
> > freezing from the filesystems that support the proper freezing. The rest
> > should be easier to deal with then.
>
> That alone makes sense. It'll however leave a load of freezer users in the
> kernel that make no sense (one example picked completely out of the air:
> w1_process(); what is the reason for it there?) and are likely broken
> (completely random examples again: md, xfsaild -- they think they are
> freezable, but they are not).

If you fix just the kthreads which do this (along with an explanation
in the changelog of why the kthread is wrong), while leaving the others
alone, that ought to be acceptable.

Alan Stern

2015-11-03 00:11:10

by Dave Chinner

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

On Mon, Nov 02, 2015 at 03:43:07AM +0100, Rafael J. Wysocki wrote:
> I guess it may also help to address the case when a device is removed from a
> suspended system, written to on another system in the meantime and inserted
> back into the (still suspended) original system which then is resumed. Today
> this is an almost guaranteed data corruption scenario, but if the filesystem in
> question is properly frozen during suspend, the driver should be able to detect
> superblock changes during unfreeze.

Never going to work. There is no guarantee that a write to a
filesystem by a third party device is going to change the superblock
(or any metadata in the rest of the filesystem) in any detectable
way. Hence freezing filesystems will not prevent Bad Things
Happening if you do this while your system is suspended.

Cheers,

Dave.
--
Dave Chinner

2015-11-03 03:36:56

by Rafael J. Wysocki

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

On Tuesday, November 03, 2015 11:10:53 AM Dave Chinner wrote:
> On Mon, Nov 02, 2015 at 03:43:07AM +0100, Rafael J. Wysocki wrote:
> > I guess it may also help to address the case when a device is removed from a
> > suspended system, written to on another system in the meantime and inserted
> > back into the (still suspended) original system which then is resumed. Today
> > this is an almost guaranteed data corruption scenario, but if the filesystem in
> > question is properly frozen during suspend, the driver should be able to detect
> > superblock changes during unfreeze.
>
> Never going to work. There is no guarantee that a write to a
> filesystem by a third party device is going to change the superblock
> (or any metadata in the rest of the filesystem) in any detectable
> way. Hence freezing filesystems will not prevent Bad Things
> Happening if you do this while your system is suspended.

OK, thanks for the clarification.

Cheers,
Rafael

2015-11-03 09:31:42

by Jan Kara

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

On Tue 03-11-15 11:10:53, Dave Chinner wrote:
> On Mon, Nov 02, 2015 at 03:43:07AM +0100, Rafael J. Wysocki wrote:
> > I guess it may also help to address the case when a device is removed from a
> > suspended system, written to on another system in the meantime and inserted
> > back into the (still suspended) original system which then is resumed. Today
> > this is an almost guaranteed data corruption scenario, but if the filesystem in
> > question is properly frozen during suspend, the driver should be able to detect
> > superblock changes during unfreeze.
>
> Never going to work. There is no guarantee that a write to a
> filesystem by a third party device is going to change the superblock
> (or any metadata in the rest of the filesystem) in any detectable
> way. Hence freezing filesystems will not prevent Bad Things
> Happening if you do this while your system is suspended.

Agreed, we should never advertise something like this works. OTOH the truth
is that e.g. in ext4 case a simple check in ext4_unfreeze() could catch 90%
of cases where user shot himself in the foot like this (i.e., ext4 driver
will update write time in superblock if it gets mounted somewhere else and
we can check whether that didn't change in ext4_unfreeze()) and refuse to
touch the filesystem... It is not 100% reliable since user could have used
e.g. debugfs to arbitrarily modify the filesystem but in such cases they
have to know what they are doing anyway.
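
A rough userspace sketch of such a check follows. Every name in it is
hypothetical and the structures are simplified stand-ins, not ext4 code (the
real on-disk write time lives in the s_wtime field of struct
ext4_super_block). The -EIO error code is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; these are not ext4's structures. */
struct model_disk_sb {
	uint32_t wtime;			/* last write time stored on the medium */
};

struct model_fs {
	struct model_disk_sb *disk;	/* what is on the medium right now */
	uint32_t wtime_at_freeze;	/* in-memory copy taken at freeze time */
	bool frozen;
	bool shut_down;			/* set when we refuse to touch the fs */
};

/* Remember the write time the filesystem had when it was frozen. */
void model_freeze(struct model_fs *fs)
{
	assert(fs != NULL && fs->disk != NULL);
	fs->wtime_at_freeze = fs->disk->wtime;
	fs->frozen = true;
}

/* Return 0 if the medium looks untouched, -EIO otherwise. */
int model_unfreeze(struct model_fs *fs)
{
	assert(fs != NULL && fs->disk != NULL && fs->frozen);
	fs->frozen = false;
	if (fs->disk->wtime != fs->wtime_at_freeze) {
		fs->shut_down = true;	/* somebody else wrote to it */
		return -EIO;
	}
	return 0;
}
```

Anything that modifies the medium without updating the write time passes this
check unnoticed, which is exactly the limitation Dave describes.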

Honza
--
Jan Kara
SUSE Labs, CR

2015-11-03 21:33:25

by Rafael J. Wysocki

Subject: Re: [PATCH 0/3] PM, vfs: use filesystem freezing instead of kthread freezer

Hi,

On Tue, Nov 3, 2015 at 10:31 AM, Jan Kara wrote:
> On Tue 03-11-15 11:10:53, Dave Chinner wrote:
>> On Mon, Nov 02, 2015 at 03:43:07AM +0100, Rafael J. Wysocki wrote:
>> > I guess it may also help to address the case when a device is removed from a
>> > suspended system, written to on another system in the meantime and inserted
>> > back into the (still suspended) original system which then is resumed. Today
>> > this is an almost guaranteed data corruption scenario, but if the filesystem in
>> > question is properly frozen during suspend, the driver should be able to detect
>> > superblock changes during unfreeze.
>>
>> Never going to work. There is no guarantee that a write to a
>> filesystem by a third party device is going to change the superblock
>> (or any metadata in the rest of the filesystem) in any detectable
>> way. Hence freezing filesystems will not prevent Bad Things
>> Happening if you do this while your system is suspended.
>
> Agreed, we should never advertise something like this works. OTOH the truth
> is that e.g. in ext4 case a simple check in ext4_unfreeze() could catch 90%
> of cases where user shot himself in the foot like this (i.e., ext4 driver
> will update write time in superblock if it gets mounted somewhere else and
> we can check whether that didn't change in ext4_unfreeze()) and refuse to
> touch the filesystem... It is not 100% reliable since user could have used
> e.g. debugfs to arbitrarily modify the filesystem but in such cases they
> have to know what they are doing anyway.

Well, my idea was to use this for sanity checking. I guess that
sanity checks here don't really hurt, do they? And if they fail,
perhaps we can just avoid touching the fs again for safety reasons?

Thanks,
Rafael