2004-03-07 14:50:14

by Pavel Machek

Subject: Some highmem pages still in use after shrink_all_memory()?

Hi!

For swsusp, I need to free as much memory as possible. Well, and it
would be great if no highmem pages remained, so that I would not have
to deal with that. Is that possible?
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]


2004-03-08 00:41:00

by Andrew Morton

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

Pavel Machek wrote:
>
> Hi!
>
> For swsusp, I need to free as much memory as possible. Well, and it
> would be great if no highmem pages remained, so that I would not have
> to deal with that. Is that possible?

No, it isn't. There are pagetable pages and mlocked user pages which we
cannot do anything with.

We could perhaps swap out the mlocked pages anyway if a suspend is in
progress, but the highmem pagetable pages are not presently reclaimed
by the VM.

2004-03-08 06:36:43

by Andy Isaacson

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

On Sun, Mar 07, 2004 at 04:40:52PM -0800, Andrew Morton wrote:
> Pavel Machek wrote:
> > For swsusp, I need to free as much memory as possible. Well, and it
> > would be great if no highmem pages remained, so that I would not have
> > to deal with that. Is that possible?
>
> No, it isn't. There are pagetable pages and mlocked user pages which we
> cannot do anything with.
>
> We could perhaps swap out the mlocked pages anyway if a suspend is in
> progress, but the highmem pagetable pages are not presently reclaimed
> by the VM.

Note that there are some applications for which it is a *bug* if an
mlocked page gets written out to magnetic media. (gpg, for example.)
I imagine that they'd rather lose the mapping and get a page fault on
the next reference (which they can then fix up with a new mmap and
mlock) than have precious key material written to disk.

... unless, of course, the swap device is securely encrypted a la
OpenBSD's 'sysctl vm.swapencrypt.enable'.

http://www.openbsd.org/papers/swapencrypt.ps

However, I don't see how to implement a cryptographically secure swsusp.

(The importance of this behavior is obviously dependent on your threat
model. Perhaps the Sufficiently Paranoid gpg users will simply need to
avoid using swsusp.)

-andy

2004-03-08 07:45:25

by Nigel Cunningham

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

Hi.

On Mon, 2004-03-08 at 19:36, Andy Isaacson wrote:
> Note that there are some applications for which it is a *bug* if an
> mlocked page gets written out to magnetic media. (gpg, for example.)
> I imagine that they'd rather lose the mapping and get a page fault on
> the next reference (which they can then fix up with a new mmap and
> mlock) than have precious key material written to disk.

For such an application, we'd have to provide a mechanism to allow an
application to set/clear a page's Nosave flag. We'd probably also want
to be able to notify user space that a suspend cycle has just occurred
and the page contents are invalid.

> However, I don't see how to implement a cryptographically secure swsusp.

It would be possible with Suspend2 - one could implement a backend (page
transformer or writer) that implemented encryption and required the user
to enter a passphrase at resume time.

> (The importance of this behavior is obviously dependent on your threat
> model. Perhaps the Sufficiently Paranoid gpg users will simply need to
> avoid using swsusp.)

Yes. Or close all gpg apps before suspending?

Nigel

--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-08 09:14:19

by Pavel Machek

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

On Po 08-03-04 00:36:39, Andy Isaacson wrote:
> On Sun, Mar 07, 2004 at 04:40:52PM -0800, Andrew Morton wrote:
> > Pavel Machek wrote:
> > > For swsusp, I need to free as much memory as possible. Well, and it
> > > would be great if no highmem pages remained, so that I would not have
> > > to deal with that. Is that possible?
> >
> > No, it isn't. There are pagetable pages and mlocked user pages which we
> > cannot do anything with.
> >
> > We could perhaps swap out the mlocked pages anyway if a suspend is in
> > progress, but the highmem pagetable pages are not presently reclaimed
> > by the VM.
>
> Note that there are some applications for which it is a *bug* if an
> mlocked page gets written out to magnetic media. (gpg, for example.)

Well, but that cannot be solved. During suspend-to-disk, data (by
definition) goes to magnetic media. We could block the suspend, or we
could kill such an application....
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-08 09:39:45

by Arjan van de Ven

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

> Note that there are some applications for which it is a *bug* if an
> mlocked page gets written out to magnetic media. (gpg, for example.)

mlock() does not guarantee things not hitting magnetic media, just as
mlock() doesn't guarantee that the physical address of a page doesn't
change. mlock guarantees that you won't get hard pagefaults and that you
have guaranteed memory for the task at hand (eg for realtime apps and
oom-critical stuff)
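
A minimal sketch of the usage under discussion, assuming a POSIX system with
mlock() and posix_memalign(). The helper names secret_alloc, secret_free and
wipe are hypothetical and not taken from gpg or any other program. The lock
only keeps the page resident; as described above, it is not a promise that
the contents never reach disk, so the buffer is wiped before release anyway.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Zero memory through a volatile pointer so the stores are not elided. */
static void wipe(void *p, size_t len)
{
    volatile unsigned char *b = p;
    while (len--)
        *b++ = 0;
}

/*
 * Allocate one page-aligned page and try to lock it in RAM.
 * Returns NULL if the allocation fails. *locked tells the caller
 * whether mlock() succeeded (it can fail under RLIMIT_MEMLOCK).
 */
static void *secret_alloc(size_t *len, int *locked)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;

    assert(page > 0);
    if (posix_memalign(&buf, (size_t)page, (size_t)page) != 0)
        return NULL;
    *len = (size_t)page;
    *locked = (mlock(buf, *len) == 0);
    return buf;
}

/* Wipe first, then unlock and free, so no key bytes outlive the buffer. */
static void secret_free(void *buf, size_t len, int locked)
{
    wipe(buf, len);
    if (locked)
        munlock(buf, len);
    free(buf);
}
```

A caller would check the locked flag and decide for itself whether to carry
on with an unlocked buffer or refuse to handle key material at all.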



2004-03-08 15:10:26

by Chris Friesen

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

Arjan van de Ven wrote:
>>Note that there are some applications for which it is a *bug* if an
>>mlocked page gets written out to magnetic media. (gpg, for example.)
>>
>
> mlock() does not guarantee things not hitting magnetic media, just as
> mlock() doesn't guarantee that the physical address of a page doesn't
> change.

The mlock() man page sure seems to hint that they do, by explicitly
describing its use by high-security data processing as a way to keep the
information from getting to disk. There also seem to be a lot of
references on the web about using mlock() in secure programming.

Something is not right...

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2004-03-08 15:17:15

by Arjan van de Ven

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

On Mon, Mar 08, 2004 at 10:09:47AM -0500, Chris Friesen wrote:
> Arjan van de Ven wrote:
> >>Note that there are some applications for which it is a *bug* if an
> >>mlocked page gets written out to magnetic media. (gpg, for example.)
> >>
> >
> >mlock() does not guarantee things not hitting magnetic media, just as
> >mlock() doesn't guarantee that the physical address of a page doesn't
> >change.
>
> The mlock() man page sure seems to hint that they do, by explicitly
> describing its use by high-security data processing as a way to keep the
> information from getting to disk.

... and explicitly describing that this is not a 100% thing due to suspend
etc etc.

----
mlock disables paging for the memory in the range starting at addr with
length len bytes. All pages which contain a part of the specified memory
range are guaranteed to be resident in RAM when the mlock system call returns
successfully and they are guaranteed to stay in RAM until the pages are
unlocked by munlock or munlockall, until the pages are unmapped via munmap,
or until the process terminates or starts another program with exec. Child
processes do not inherit page locks across a fork.
-----

That is what it guarantees: that you don't hard-fault.
The rest of the manpage talks about potential usages but immediately
describes the crypto one as non-solid.



2004-03-08 15:36:04

by Chris Friesen

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

Arjan van de Ven wrote:

> that is what it guarantees. it guarantees that you don't hard-fault.
> The rest of the manpage talks about potential usages but immediately
> describes the crypto one as non-solid

Guess I've got older manpages...mine don't have the caveats.

Chris


--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2004-03-08 16:36:23

by Andy Isaacson

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

On Mon, Mar 08, 2004 at 10:39:32AM +0100, Arjan van de Ven wrote:
> > Note that there are some applications for which it is a *bug* if an
> > mlocked page gets written out to magnetic media. (gpg, for example.)
>
> mlock() does not guarantee things not hitting magnetic media, just as
> mlock() doesn't guarantee that the physical address of a page doesn't
> change. mlock guarantees that you won't get hard pagefaults and that you
> have guaranteed memory for the task at hand (eg for realtime apps and
> oom-critical stuff)

Well, that's fine -- you can certainly define mlock to have whatever
semantics you want. But the semantics that gpg depends on are
reasonable, and if mlock is changed to have other semantics, there
should be some way for apps to get the behavior that used to be
implemented by mlock (and *documented* in the mlock man page).

It's a pity that mlock doesn't take a flags argument.

-andy

2004-03-08 17:54:45

by Zwane Mwaikambo

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

On Mon, 8 Mar 2004, Chris Friesen wrote:

> Arjan van de Ven wrote:
>
> > that is what it guarantees. it guarantees that you don't hard-fault.
> > The rest of the manpage talks about potential usages but immediately
> > describes the crypto one as non-solid
>
> Guess I've got older manpages...mine don't have the caveats.

POSIX doesn't mention sample usage or caveats either; it only
states guaranteed memory residency.

2004-03-08 18:34:20

by Pavel Machek

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

Hi!

> > > Note that there are some applications for which it is a *bug* if an
> > > mlocked page gets written out to magnetic media. (gpg, for example.)
> >
> > mlock() does not guarantee things not hitting magnetic media, just as
> > mlock() doesn't guarantee that the physical address of a page doesn't
> > change. mlock guarantees that you won't get hard pagefaults and that you
> > have guaranteed memory for the task at hand (eg for realtime apps and
> > oom-critical stuff)
>
> Well, that's fine -- you can certainly define mlock to have whatever
> semantics you want. But the semantics that gpg depends on are
> reasonable, and if mlock is changed to have other semantics, there
> should be some way for apps to get the behavior that used to be
> implemented by mlock (and *documented* in the mlock man page).
>
> It's a pity that mlock doesn't take a flags argument.

How would it help?

Block system-wide suspend because 4K are mlocked?
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-08 18:52:20

by Andy Isaacson

Subject: Re: Some highmem pages still in use after shrink_all_memory()?

On Mon, Mar 08, 2004 at 07:34:01PM +0100, Pavel Machek wrote:
> > > > Note that there are some applications for which it is a *bug* if an
> > > > mlocked page gets written out to magnetic media. (gpg, for example.)
> > >
> > > mlock() does not guarantee things not hitting magnetic media, just as
> > > mlock() doesn't guarantee that the physical address of a page doesn't
> > > change. mlock guarantees that you won't get hard pagefaults and that you
> > > have guaranteed memory for the task at hand (eg for realtime apps and
> > > oom-critical stuff)
> >
> > Well, that's fine -- you can certainly define mlock to have whatever
> > semantics you want. But the semantics that gpg depends on are
> > reasonable, and if mlock is changed to have other semantics, there
> > should be some way for apps to get the behavior that used to be
> > implemented by mlock (and *documented* in the mlock man page).
> >
> > It's a pity that mlock doesn't take a flags argument.
>
> How would it help?
>
> Block system-wide suspend because 4K are mlocked?

Sorry, I left too much of my train of thought implicit. I'm suggesting
that it would be cool if there were an mlockf(addr, len, ML_NOSWAP) which
would allow an app to say "do not write this page to disk or send it
over the network." If the kernel decided to evict that page (due to
doing a suspend, perhaps) it would just drop the mapping, and when the
app next used it there would be a SEGV delivered.

Alternatively, you could define a protocol for suspend to notify apps
with mlocked memory that they must clean up in preparation for a
suspend. It doesn't have to be bulletproof; you can give them one
chance, and if they don't do it just proceed with the suspend.
(Unfortunately this does violate the "never write key material to
magnetic store" semantic as well, but at least you've given the app a
chance.) Perhaps just SIGUSR1 or something?
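
On the application side, the "one chance to clean up" idea might look roughly
like the sketch below. It assumes the notification arrives as SIGUSR1, which
is only the suggestion made here; no kernel sends such a signal before
suspend. The names key_material, load_key, on_presuspend and
install_presuspend_handler are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

#define KEY_MAX 64

/* Key buffer and validity flag, both written from the signal handler. */
static volatile unsigned char key_material[KEY_MAX];
static volatile sig_atomic_t key_valid;

/* Copy a key into the buffer and mark it usable. */
static void load_key(const unsigned char *src, size_t len)
{
    assert(len <= KEY_MAX);
    for (size_t i = 0; i < len; i++)
        key_material[i] = src[i];
    key_valid = 1;
}

/*
 * Hypothetical pre-suspend notification: wipe the key and mark it
 * invalid. Only plain stores are used, so this is safe in a handler.
 */
static void on_presuspend(int signo)
{
    (void)signo;
    for (size_t i = 0; i < KEY_MAX; i++)
        key_material[i] = 0;
    key_valid = 0;
}

/* Register the handler for SIGUSR1; returns 0 on success, -1 on error. */
static int install_presuspend_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_presuspend;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

After resume the application would see key_valid == 0 and ask the user for
the passphrase again rather than trusting the buffer.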

Perhaps a better API than mlockf(addr, len, flags) would be a
mattr(addr, len, flags) with MA_NOFAULT, MA_FIXEDPHYSADDR, MA_NOSWAP...
mlock() could then be defined as mattr(addr, len, MA_NOFAULT).
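
To make the shape of that interface concrete, here is a userspace sketch.
mattr() and the MA_* flags do not exist in any kernel or libc; the flag
values and the wrapper name mlock_via_mattr are invented for illustration.
Only MA_NOFAULT can be honoured today (by calling mlock()), so the sketch
rejects the other two flags with ENOSYS instead of pretending to implement
them.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical flag values; nothing here is a real kernel ABI. */
#define MA_NOFAULT       0x1  /* no hard page faults (today's mlock)  */
#define MA_FIXEDPHYSADDR 0x2  /* physical address must not change     */
#define MA_NOSWAP        0x4  /* never write the page to disk/network */

#define MA_ALL (MA_NOFAULT | MA_FIXEDPHYSADDR | MA_NOSWAP)

/*
 * Sketch of the proposed call. Returns 0 on success, -1 with errno set:
 * EINVAL for unknown flags, ENOSYS for attributes that have no
 * existing mechanism behind them.
 */
static int mattr(const void *addr, size_t len, int flags)
{
    if (flags & ~MA_ALL) {
        errno = EINVAL;
        return -1;
    }
    if (flags & (MA_FIXEDPHYSADDR | MA_NOSWAP)) {
        errno = ENOSYS;
        return -1;
    }
    if (flags & MA_NOFAULT)
        return mlock(addr, len);
    return 0;
}

/* mlock() expressed in terms of the proposed interface. */
static int mlock_via_mattr(const void *addr, size_t len)
{
    return mattr(addr, len, MA_NOFAULT);
}
```

An application that needs the stronger guarantee would call
mattr(buf, len, MA_NOFAULT | MA_NOSWAP) and treat ENOSYS as "this system
cannot promise that".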


I agree that all of this is beyond the scope of what you're trying to do
in swsusp. I just want to bring up the issues so that they're not
ignored.

-andy