2009-10-06 09:51:41

by Gleb Natapov

Subject: [PATCH][RFC] add MAP_UNLOCKED mmap flag

If application does mlockall(MCL_FUTURE) it is no longer possible to
mmap file bigger than main memory or allocate big area of anonymous
memory. Sometimes it is desirable to lock everything related to program
execution into memory, but still be able to mmap big file or allocate
huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
allows to do that.

Signed-off-by: Gleb Natapov <[email protected]>
diff --git a/include/asm-generic/mman.h b/include/asm-generic/mman.h
index 32c8bd6..0ab4c74 100644
--- a/include/asm-generic/mman.h
+++ b/include/asm-generic/mman.h
@@ -12,6 +12,7 @@
 #define MAP_NONBLOCK 0x10000 /* do not block on IO */
 #define MAP_STACK 0x20000 /* give out an address that is best suited for process/thread stacks */
 #define MAP_HUGETLB 0x40000 /* create a huge page mapping */
+#define MAP_UNLOKED 0x80000 /* pages are unlocked */
 
 #define MCL_CURRENT 1 /* lock all current mappings */
 #define MCL_FUTURE 2 /* lock all future mappings */
diff --git a/mm/mmap.c b/mm/mmap.c
index 73f5e4b..7c2abdb 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -985,6 +985,9 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
 		if (!can_do_mlock())
 			return -EPERM;
 
+	if (flags & MAP_UNLOKED)
+		vm_flags &= ~VM_LOCKED;
+
 	/* mlock MCL_FUTURE? */
 	if (vm_flags & VM_LOCKED) {
 		unsigned long locked, lock_limit;
--
Gleb.
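
The effect of the mm/mmap.c hunk can be modelled in userspace. This is a minimal sketch, not kernel code: the `SK_` constants and both function names are hypothetical and their values are illustrative only. It assumes that mlockall(MCL_FUTURE) puts the lock bit into the default flags for new mappings, and that the proposed flag clears that bit for one mapping.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for VM_LOCKED, MAP_LOCKED and the proposed flag. */
#define SK_VM_LOCKED    0x1UL
#define SK_MAP_LOCKED   0x2UL
#define SK_MAP_UNLOCKED 0x4UL

/*
 * Model of the decision: start from the process-wide default flags
 * (which carry SK_VM_LOCKED after mlockall(MCL_FUTURE)), set the lock
 * bit for MAP_LOCKED, then let the proposed flag clear it again.
 */
unsigned long effective_vm_flags(unsigned long def_flags,
                                 unsigned long map_flags)
{
    unsigned long vm_flags = def_flags;

    if (map_flags & SK_MAP_LOCKED)
        vm_flags |= SK_VM_LOCKED;
    if (map_flags & SK_MAP_UNLOCKED)
        vm_flags &= ~SK_VM_LOCKED;
    return vm_flags;
}

/* True if a mapping created with these flags would end up locked. */
bool mapping_is_locked(unsigned long def_flags, unsigned long map_flags)
{
    return (effective_vm_flags(def_flags, map_flags) & SK_VM_LOCKED) != 0;
}
```

In this model a mapping requested with the unlock flag is never locked, whatever the process default is. All other mappings keep following the default.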


2009-10-06 10:16:34

by Frederik Deweerdt

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

Hello Gleb,

On Tue, Oct 06, 2009 at 11:51:11AM +0200, Gleb Natapov wrote:
> If application does mlockall(MCL_FUTURE) it is no longer possible to
> mmap file bigger than main memory or allocate big area of anonymous
> memory. Sometimes it is desirable to lock everything related to program
> execution into memory, but still be able to mmap big file or allocate
> huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
> allows to do that.
>
> Signed-off-by: Gleb Natapov <[email protected]>
> diff --git a/include/asm-generic/mman.h b/include/asm-generic/mman.h
> index 32c8bd6..0ab4c74 100644
> --- a/include/asm-generic/mman.h
> +++ b/include/asm-generic/mman.h
> @@ -12,6 +12,7 @@
> #define MAP_NONBLOCK 0x10000 /* do not block on IO */
> #define MAP_STACK 0x20000 /* give out an address that is best suited for process/thread stacks */
> #define MAP_HUGETLB 0x40000 /* create a huge page mapping */
> +#define MAP_UNLOKED 0x80000 /* pages are unlocked */
^^^
You're missing a 'C' here and below. Also '/* force page unlocking */'
seems a better comment?

Regards,
Frederik
>
> #define MCL_CURRENT 1 /* lock all current mappings */
> #define MCL_FUTURE 2 /* lock all future mappings */
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 73f5e4b..7c2abdb 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -985,6 +985,9 @@ unsigned long do_mmap_pgoff(struct file *file, unsigned long addr,
> if (!can_do_mlock())
> return -EPERM;
>
> + if (flags & MAP_UNLOKED)
> + vm_flags &= ~VM_LOCKED;
> +
> /* mlock MCL_FUTURE? */
> if (vm_flags & VM_LOCKED) {
> unsigned long locked, lock_limit;
> --
> Gleb.

2009-10-06 10:11:49

by KOSAKI Motohiro

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

Hi

> If application does mlockall(MCL_FUTURE) it is no longer possible to
> mmap file bigger than main memory or allocate big area of anonymous
> memory. Sometimes it is desirable to lock everything related to program
> execution into memory, but still be able to mmap big file or allocate
> huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
> allows to do that.
>
> Signed-off-by: Gleb Natapov <[email protected]>

Why don't you use explicit munlock()?
Plus, can you please elaborate on which workload needs this feature?


2009-10-06 10:22:10

by Gleb Natapov

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Tue, Oct 06, 2009 at 07:11:06PM +0900, KOSAKI Motohiro wrote:
> Hi
>
> > If application does mlockall(MCL_FUTURE) it is no longer possible to
> > mmap file bigger than main memory or allocate big area of anonymous
> > memory. Sometimes it is desirable to lock everything related to program
> > execution into memory, but still be able to mmap big file or allocate
> > huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
> > allows to do that.
> >
> > Signed-off-by: Gleb Natapov <[email protected]>
>
> Why don't you use explicit munlock()?
Because mmap will fail before I have a chance to run munlock on it.
Actually, when I run my process inside a memory-limited container, the host
dies (I suppose from thrashing, but I haven't checked).

> Plus, can you please elaborate on which workload needs this feature?
>
I wanted to run kvm with the qemu process locked in memory, but guest memory
unlocked. And guest memory is bigger than host memory in the case I am
testing. I found out that it is impossible currently.

--
Gleb.
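
The ordering problem described above can be sketched as follows. This is only an illustration under assumptions: `map_then_munlock()` is a hypothetical helper, and the region is anonymous memory rather than a guest image. With MCL_FUTURE in effect, a request larger than what may be locked fails inside mmap(), so the munlock() call is never reached.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous memory, then explicitly unlock the range.
 * Returns the mapping, or NULL if either step fails.
 */
void *map_then_munlock(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;    /* a too-large request under MCL_FUTURE ends here */

    if (munlock(p, len) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```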

2009-10-06 10:28:39

by KOSAKI Motohiro

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

> On Tue, Oct 06, 2009 at 07:11:06PM +0900, KOSAKI Motohiro wrote:
> > Hi
> >
> > > If application does mlockall(MCL_FUTURE) it is no longer possible to
> > > mmap file bigger than main memory or allocate big area of anonymous
> > > memory. Sometimes it is desirable to lock everything related to program
> > > execution into memory, but still be able to mmap big file or allocate
> > > huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
> > > allows to do that.
> > >
> > > Signed-off-by: Gleb Natapov <[email protected]>
> >
> > Why don't you use explicit munlock()?
> Because mmap will fail before I have a chance to run munlock on it.
> Actually, when I run my process inside a memory-limited container, the host
> dies (I suppose from thrashing, but I haven't checked).
>
> > Plus, can you please elaborate on which workload needs this feature?
> >
> I wanted to run kvm with the qemu process locked in memory, but guest memory
> unlocked. And guest memory is bigger than host memory in the case I am
> testing. I found out that it is impossible currently.

1. process creation (qemu)
2. load all libraries
3. mlockall(MCL_CURRENT)
4. load guest OS

is impossible? why?
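
Steps 3 and 4 of the sequence above could look roughly like this in C. This is a sketch under assumptions: `lock_current_then_map()` is a hypothetical helper, and an anonymous mapping stands in for loading the guest. Only what is mapped at the time of the mlockall() call gets locked. Mappings created afterwards, whether the big region or anything else, stay unlocked.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Lock everything mapped right now, then create the large region.
 * Without MCL_FUTURE the new region is not locked. `lock_rc` receives
 * the mlockall() result, which depends on RLIMIT_MEMLOCK and privileges.
 * Returns the new mapping, or NULL if mmap() fails.
 */
void *lock_current_then_map(size_t len, int *lock_rc)
{
    *lock_rc = mlockall(MCL_CURRENT);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```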



2009-10-06 10:33:32

by Gleb Natapov

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Tue, Oct 06, 2009 at 07:27:56PM +0900, KOSAKI Motohiro wrote:
> > On Tue, Oct 06, 2009 at 07:11:06PM +0900, KOSAKI Motohiro wrote:
> > > Hi
> > >
> > > > If application does mlockall(MCL_FUTURE) it is no longer possible to
> > > > mmap file bigger than main memory or allocate big area of anonymous
> > > > memory. Sometimes it is desirable to lock everything related to program
> > > > execution into memory, but still be able to mmap big file or allocate
> > > > huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
> > > > allows to do that.
> > > >
> > > > Signed-off-by: Gleb Natapov <[email protected]>
> > >
> > > Why don't you use explicit munlock()?
> > Because mmap will fail before I have a chance to run munlock on it.
> > Actually, when I run my process inside a memory-limited container, the host
> > dies (I suppose from thrashing, but I haven't checked).
> >
> > > Plus, can you please elaborate on which workload needs this feature?
> > >
> > I wanted to run kvm with the qemu process locked in memory, but guest memory
> > unlocked. And guest memory is bigger than host memory in the case I am
> > testing. I found out that it is impossible currently.
>
> 1. process creation (qemu)
> 2. load all libraries
Can't control this if the program has plugins. Not the qemu case,
though.

> 3. mlockall(MCL_CURRENT)
> 4. load guest OS
And what about all the other allocations qemu does during its lifetime? Not
all of them will be small enough to come from the brk area.

>
> is impossible? why?
>
Because what you are proposing is not the same as mlockall(MCL_CURRENT|MCL_FUTURE);
You essentially say that MCL_FUTURE is not needed.

--
Gleb.

2009-10-06 11:00:34

by Gleb Natapov

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Tue, Oct 06, 2009 at 12:49:52PM +0200, Arnd Bergmann wrote:
> On Tuesday 06 October 2009, Gleb Natapov wrote:
> > If application does mlockall(MCL_FUTURE) it is no longer possible to
> > mmap file bigger than main memory or allocate big area of anonymous
> > memory. Sometimes it is desirable to lock everything related to program
> > execution into memory, but still be able to mmap big file or allocate
> > huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
> > allows to do that.
> >
> > Signed-off-by: Gleb Natapov <[email protected]>
> > diff --git a/include/asm-generic/mman.h b/include/asm-generic/mman.h
> > index 32c8bd6..0ab4c74 100644
> > --- a/include/asm-generic/mman.h
> > +++ b/include/asm-generic/mman.h
> > @@ -12,6 +12,7 @@
> > #define MAP_NONBLOCK 0x10000 /* do not block on IO */
> > #define MAP_STACK 0x20000 /* give out an address that is best suited for process/thread stacks */
> > #define MAP_HUGETLB 0x40000 /* create a huge page mapping */
> > +#define MAP_UNLOKED 0x80000 /* pages are unlocked */
> >
> > #define MCL_CURRENT 1 /* lock all current mappings */
> > #define MCL_FUTURE 2 /* lock all future mappings */
>
> Not all architectures use asm-generic/mman.h, so you have to change
> the other architectures separately if you add a flag.
>
Ah, good to know. Thanks.

--
Gleb.

2009-10-06 12:11:14

by KOSAKI Motohiro

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

2009/10/6 Gleb Natapov <[email protected]>:
> On Tue, Oct 06, 2009 at 07:27:56PM +0900, KOSAKI Motohiro wrote:
>> > On Tue, Oct 06, 2009 at 07:11:06PM +0900, KOSAKI Motohiro wrote:
>> > > Hi
>> > >
>> > > > If application does mlockall(MCL_FUTURE) it is no longer possible to
>> > > > mmap file bigger than main memory or allocate big area of anonymous
>> > > > memory. Sometimes it is desirable to lock everything related to program
>> > > > execution into memory, but still be able to mmap big file or allocate
>> > > > huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
>> > > > allows to do that.
>> > > >
>> > > > Signed-off-by: Gleb Natapov <[email protected]>
>> > >
>> > > Why don't you use explicit munlock()?
>> > Because mmap will fail before I have a chance to run munlock on it.
>> > Actually, when I run my process inside a memory-limited container, the host
>> > dies (I suppose from thrashing, but I haven't checked).
>> >
>> > > Plus, can you please elaborate on which workload needs this feature?
>> > >
>> > I wanted to run kvm with the qemu process locked in memory, but guest memory
>> > unlocked. And guest memory is bigger than host memory in the case I am
>> > testing. I found out that it is impossible currently.
>>
>> 1. process creation (qemu)
>> 2. load all libraries
> Can't control this if the program has plugins. Not the qemu case,
> though.
>
>> 3. mlockall(MCL_CURRENT)
>> 4. load guest OS
> And what about all the other allocations qemu does during its lifetime? Not
> all of them will be small enough to come from the brk area.
>
>> is impossible? why?
>>
> Because what you are proposing is not the same as mlockall(MCL_CURRENT|MCL_FUTURE);
>
> You essentially say that MCL_FUTURE is not needed.

No, I only think your case doesn't fit MCL_FUTURE.
I haven't found any real benefit in this patch.

2009-10-06 12:16:34

by Gleb Natapov

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Tue, Oct 06, 2009 at 09:10:35PM +0900, KOSAKI Motohiro wrote:
> 2009/10/6 Gleb Natapov <[email protected]>:
> > On Tue, Oct 06, 2009 at 07:27:56PM +0900, KOSAKI Motohiro wrote:
> >> > On Tue, Oct 06, 2009 at 07:11:06PM +0900, KOSAKI Motohiro wrote:
> >> > > Hi
> >> > >
> >> > > > If application does mlockall(MCL_FUTURE) it is no longer possible to
> >> > > > mmap file bigger than main memory or allocate big area of anonymous
> >> > > > memory. Sometimes it is desirable to lock everything related to program
> >> > > > execution into memory, but still be able to mmap big file or allocate
> >> > > > huge amount of memory and allow OS to swap them on demand. MAP_UNLOCKED
> >> > > > allows to do that.
> >> > > >
> >> > > > Signed-off-by: Gleb Natapov <[email protected]>
> >> > >
> >> > > Why don't you use explicit munlock()?
> >> > Because mmap will fail before I have a chance to run munlock on it.
> >> > Actually, when I run my process inside a memory-limited container, the host
> >> > dies (I suppose from thrashing, but I haven't checked).
> >> >
> >> > > Plus, can you please elaborate on which workload needs this feature?
> >> > >
> >> > I wanted to run kvm with the qemu process locked in memory, but guest memory
> >> > unlocked. And guest memory is bigger than host memory in the case I am
> >> > testing. I found out that it is impossible currently.
> >>
> >> 1. process creation (qemu)
> >> 2. load all libraries
> > Can't control this if the program has plugins. Not the qemu case,
> > though.
> >
> >> 3. mlockall(MCL_CURRENT)
> >> 4. load guest OS
> > And what about all the other allocations qemu does during its lifetime? Not
> > all of them will be small enough to come from the brk area.
> >
> >> is impossible? why?
> >>
> > Because what you are proposing is not the same as mlockall(MCL_CURRENT|MCL_FUTURE);
> >
> > You essentially say that MCL_FUTURE is not needed.
>
> No, I only think your case doesn't fit MCL_FUTURE.
> I haven't found any real benefit in this patch.
I did. It allows me to achieve something I can't now. Steps you provide
just don't fit my needs. I need all memory areas (current and future) to be
locked except one. Very big one. You propose to lock memory at some
arbitrary point and from that point on all newly mapped memory areas will
be unlocked. Don't you see it is different?

--
Gleb.

2009-10-06 13:50:34

by Peter Zijlstra

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Tue, 2009-10-06 at 14:16 +0200, Gleb Natapov wrote:
> > No, I only think your case doesn't fit MCL_FUTURE.
> > I haven't found any real benefit in this patch.

> I did. It allows me to achieve something I can't now. Steps you provide
> just don't fit my needs. I need all memory areas (current and future) to be
> locked except one. Very big one. You propose to lock memory at some
> arbitrary point and from that point on all newly mapped memory areas will
> be unlocked. Don't you see it is different?

While true, it does demonstrate very sloppy programming. The proper fix
is to rework qemu to mlock what is needed.

I'm not sure encouraging mlockall() usage is a good thing. When using
resource locks one had better know what he's doing. mlockall() doesn't
promote caution.

2009-10-06 14:07:21

by Gleb Natapov

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Tue, Oct 06, 2009 at 03:50:03PM +0200, Peter Zijlstra wrote:
> On Tue, 2009-10-06 at 14:16 +0200, Gleb Natapov wrote:
> > > No, I only think your case doesn't fit MCL_FUTURE.
> > > I haven't found any real benefit in this patch.
>
> > I did. It allows me to achieve something I can't now. Steps you provide
> > just don't fit my needs. I need all memory areas (current and future) to be
> > locked except one. Very big one. You propose to lock memory at some
> > arbitrary point and from that point on all newly mapped memory areas will
> > be unlocked. Don't you see it is different?
>
> While true, it does demonstrate very sloppy programming. The proper fix
> is to rework qemu to mlock what is needed.
>
So you are saying that for an application (any application, forget about qemu) to lock
everything except one memory region, it needs to provide its own memory allocation
routines and its own dynamic linker? BTW, the interface is currently not symmetric.
An application may mmap a single memory area locked (MAP_LOCKED), but can't do the reverse
if mlockall(MCL_FUTURE) was called.

> I'm not sure encouraging mlockall() usage is a good thing. When using
It is up to the application programmer to decide whether he wants to use
mlockall() or not. Maybe he has a good reason to do so. As it stands, the
existing interface doesn't allow me to do what I need without rewriting
the libc memory allocator and the dynamic linking loader.

> resource locks one had better know what he's doing. mlockall() doesn't
> promote caution.
No need to patronize userspace developers. Let's provide them with
a flexible interface, and if they use it inappropriately we will not use
their software.

--
Gleb.
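
The direction that already exists, locking one region in an otherwise unlocked process, can be sketched like this. `map_one_locked()` is a hypothetical helper name. MAP_LOCKED is the existing Linux mmap flag mentioned above. The reverse flag proposed in this thread is deliberately not used, since it is only an RFC here.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous memory and ask for it to be locked.
 * Returns NULL if the kernel refuses, for example because of
 * RLIMIT_MEMLOCK.
 */
void *map_one_locked(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```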

2009-10-07 19:02:38

by Olivier Galibert

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Tue, Oct 06, 2009 at 02:16:03PM +0200, Gleb Natapov wrote:
> I did. It allows me to achieve something I can't now. Steps you provide
> just don't fit my needs. I need all memory areas (current and future) to be
> locked except one. Very big one. You propose to lock memory at some
> arbitrary point and from that point on all newly mapped memory areas will
> be unlocked. Don't you see it is different?

What about mlockall(MCL_CURRENT); mmap(...); mlockall(MCL_FUTURE);?
Or toggle MCL_FUTURE if a mlockall call can stop it?

OG.
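
The suggested sequence could be written roughly as below. This is a sketch under assumptions: `map_unlocked_by_toggling()` is a hypothetical helper, and it relies on mlockall(MCL_CURRENT) turning off the "lock future mappings" default, which is how the toggling idea in this message is understood here. Nothing in it stops another thread from mapping memory between the two mlockall() calls.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Create one unlocked mapping in a process that otherwise wants
 * everything locked:
 *   1. mlockall(MCL_CURRENT) keeps existing pages locked and stops
 *      locking of new mappings,
 *   2. mmap() creates the big region unlocked,
 *   3. mlockall(MCL_FUTURE) re-arms locking for later mappings.
 * `relock_rc` receives the result of step 3. Returns NULL if step 1
 * or step 2 fails.
 */
void *map_unlocked_by_toggling(size_t len, int *relock_rc)
{
    void *p;

    if (mlockall(MCL_CURRENT) != 0)
        return NULL;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    /* Re-arm whether or not the mapping succeeded. */
    *relock_rc = mlockall(MCL_FUTURE);

    return p == MAP_FAILED ? NULL : p;
}
```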

2009-10-07 19:00:26

by Gleb Natapov

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Wed, Oct 07, 2009 at 08:50:54PM +0200, Olivier Galibert wrote:
> On Tue, Oct 06, 2009 at 02:16:03PM +0200, Gleb Natapov wrote:
> > I did. It allows me to achieve something I can't now. Steps you provide
> > just don't fit my needs. I need all memory areas (current and future) to be
> > locked except one. Very big one. You propose to lock memory at some
> > arbitrary point and from that point on all newly mapped memory areas will
> > be unlocked. Don't you see it is different?
>
> What about mlockall(MCL_CURRENT); mmap(...); mlockall(MCL_FUTURE);?
> Or toggle MCL_FUTURE if a mlockall call can stop it?
>
This may work. And MCL_FUTURE can be toggled, but this is not thread
safe.

--
Gleb.

2009-10-07 20:11:10

by Olivier Galibert

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Wed, Oct 07, 2009 at 08:59:52PM +0200, Gleb Natapov wrote:
> On Wed, Oct 07, 2009 at 08:50:54PM +0200, Olivier Galibert wrote:
> > On Tue, Oct 06, 2009 at 02:16:03PM +0200, Gleb Natapov wrote:
> > > I did. It allows me to achieve something I can't now. Steps you provide
> > > just don't fit my needs. I need all memory areas (current and future) to be
> > > locked except one. Very big one. You propose to lock memory at some
> > > arbitrary point and from that point on all newly mapped memory areas will
> > > be unlocked. Don't you see it is different?
> >
> > What about mlockall(MCL_CURRENT); mmap(...); mlockall(MCL_FUTURE);?
> > Or toggle MCL_FUTURE if a mlockall call can stop it?
> >
> This may work. And MCL_FUTURE can be toggled, but this is not thread
> safe.

Just ensure that your one special mmap is done with the other threads
not currently allocating stuff. It's probably a synchronization point
for the whole process anyway.

OG.

2009-10-07 20:47:52

by Gleb Natapov

Subject: Re: [PATCH][RFC] add MAP_UNLOCKED mmap flag

On Wed, Oct 07, 2009 at 10:10:17PM +0200, Olivier Galibert wrote:
> On Wed, Oct 07, 2009 at 08:59:52PM +0200, Gleb Natapov wrote:
> > On Wed, Oct 07, 2009 at 08:50:54PM +0200, Olivier Galibert wrote:
> > > On Tue, Oct 06, 2009 at 02:16:03PM +0200, Gleb Natapov wrote:
> > > > I did. It allows me to achieve something I can't now. Steps you provide
> > > > just don't fit my needs. I need all memory areas (current and future) to be
> > > > locked except one. Very big one. You propose to lock memory at some
> > > > arbitrary point and from that point on all newly mapped memory areas will
> > > > be unlocked. Don't you see it is different?
> > >
> > > What about mlockall(MCL_CURRENT); mmap(...); mlockall(MCL_FUTURE);?
> > > Or toggle MCL_FUTURE if a mlockall call can stop it?
> > >
> > This may work. And MCL_FUTURE can be toggled, but this is not thread
> > safe.
>
> Just ensure that your one special mmap is done with the other threads
> not currently allocating stuff. It's probably a synchronization point
> for the whole process anyway.
>
How can you stop other threads and libraries from calling malloc()? And what
if there are two special allocations? Or many mmap(big file)/munmap(big file)?
This is the same issue as opening a file with CLOEXEC atomically. Why not
prevent other threads from calling fork() instead of adding flags to a
bunch of system calls?

--
Gleb.
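
The CLOEXEC comparison can be made concrete with a small sketch. The three helper names are hypothetical. O_CLOEXEC and FD_CLOEXEC are the existing open()/fcntl() flags. The two-step version leaves a window in which another thread can fork() and exec() with the descriptor still inheritable. The flag version has no such window, which is the property argued for above with a per-call mmap flag.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Racy: another thread may fork() and exec() between open() and fcntl(). */
int open_then_set_cloexec(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fcntl(fd, F_SETFD, FD_CLOEXEC) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Atomic: the flag is applied by the same system call that creates the fd. */
int open_cloexec(const char *path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Returns 1 if FD_CLOEXEC is set on fd, 0 if not, -1 on error. */
int has_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    return flags < 0 ? -1 : (flags & FD_CLOEXEC) != 0;
}
```

Both helpers end with the same descriptor state. The difference is only whether other threads can observe the intermediate state.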