2009-10-13 07:44:44

by Mike Frysinger

Subject: [PATCH] NOMMU: fix malloc performance by adding uninitialized flag

From: Jie Zhang <[email protected]>

The no-mmu code currently clears all anonymous mmap-ed memory. While this
is what we want in the default case, all memory allocation from userspace
under no-mmu has to go through this interface, including malloc() which is
allowed to return uninitialized memory. This can easily be a significant
performance slowdown. So for constrained embedded systems where security
is irrelevant, allow people to avoid unnecessarily clearing memory.

Signed-off-by: Jie Zhang <[email protected]>
Signed-off-by: Robin Getz <[email protected]>
Signed-off-by: Mike Frysinger <[email protected]>
---
Documentation/nommu-mmap.txt | 21 +++++++++++++++++++++
fs/binfmt_elf_fdpic.c | 2 +-
include/asm-generic/mman-common.h | 5 +++++
init/Kconfig | 16 ++++++++++++++++
mm/nommu.c | 7 ++++---
5 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
index b565e82..30d09e8 100644
--- a/Documentation/nommu-mmap.txt
+++ b/Documentation/nommu-mmap.txt
@@ -16,6 +16,27 @@ the CLONE_VM flag.
The behaviour is similar between the MMU and no-MMU cases, but not identical;
and it's also much more restricted in the latter case:

+ (*) Anonymous mappings - general case
+
+ Anonymous mappings are not backed by any file, and according to the
+ Linux man pages (ver 2.22 or later) contents are initialized to zero.
+
+ In the MMU case, regions are backed by arbitrary virtual pages, and the
+ contents are only mapped with physical pages and initialized to zero
+ when a read or write happens in that specific page. This spreads out
+ the time it takes to initialize the contents depending on the
+ read/write usage of the map.
+
+ In the no-MMU case, anonymous mappings are backed by physical pages,
+ and the entire map is initialized to zero at allocation time. This
+ can cause significant delays in userspace during malloc() as the C
+ library does an anonymous mapping, and the kernel is doing a memset
+ for the entire map. Since malloc's memory is not required to be
+ cleared, an (optional) flag MAP_UNINITIALIZE can be passed to the
+ kernel's do_mmap, which will not initialize the contents to zero.
+
+ uClibc supports this to provide a pretty significant speedup for malloc().
+
(*) Anonymous mapping, MAP_PRIVATE

In the MMU case: VM regions backed by arbitrary pages; copy-on-write
diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 38502c6..85db4a4 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -380,7 +380,7 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
down_write(&current->mm->mmap_sem);
current->mm->start_brk = do_mmap(NULL, 0, stack_size,
PROT_READ | PROT_WRITE | PROT_EXEC,
- MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
+ MAP_PRIVATE | MAP_ANONYMOUS | MAP_UNINITIALIZE | MAP_GROWSDOWN,
0);

if (IS_ERR_VALUE(current->mm->start_brk)) {
diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
index 5ee13b2..dddf626 100644
--- a/include/asm-generic/mman-common.h
+++ b/include/asm-generic/mman-common.h
@@ -19,6 +19,11 @@
#define MAP_TYPE 0x0f /* Mask for type of mapping */
#define MAP_FIXED 0x10 /* Interpret addr exactly */
#define MAP_ANONYMOUS 0x20 /* don't use a file */
+#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZE
+# define MAP_UNINITIALIZE 0x4000000 /* For anonymous mmap, memory could be uninitialized */
+#else
+# define MAP_UNINITIALIZE 0x0 /* Don't support this flag */
+#endif

#define MS_ASYNC 1 /* sync memory asynchronously */
#define MS_INVALIDATE 2 /* invalidate the caches */
diff --git a/init/Kconfig b/init/Kconfig
index 09c5c64..ae15849 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1069,6 +1069,22 @@ config SLOB

endchoice

+config MMAP_ALLOW_UNINITIALIZE
+ bool "Allow mmaped anonymous memory to be un-initialized"
+ depends on EMBEDDED && ! MMU
+ default n
+ help
+ Normally (and according to the Linux spec) mmap'ed MAP_ANONYMOUS
+ memory has its contents initialized to zero. This kernel option
+ gives you the option of not doing that by adding a MAP_UNINITIALIZE
+ mmap flag (which uClibc's malloc() takes advantage of)
+ which provides a huge performance boost.
+
+ Because of the obvious security issues, this option should only be
+ enabled on embedded devices where you control what is run in
+ userspace. Since that isn't a problem on no-MMU systems, it is
+ normally safe to say Y here.
+
config PROFILING
bool "Profiling support (EXPERIMENTAL)"
help
diff --git a/mm/nommu.c b/mm/nommu.c
index 5189b5a..b62bd9d 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
if (ret < rlen)
memset(base + ret, 0, rlen - ret);

- } else {
- /* if it's an anonymous mapping, then just clear it */
- memset(base, 0, rlen);
}

return 0;
@@ -1343,6 +1340,10 @@ unsigned long do_mmap_pgoff(struct file *file,
goto error_just_free;
add_nommu_region(region);

+ /* clear anonymous mappings that don't ask for un-initialized data */
+ if (!(vma->vm_file) && !(flags & MAP_UNINITIALIZE))
+ memset((void *)region->vm_start, 0, region->vm_end - region->vm_start);
+
/* okay... we have a mapping; now we have to register it */
result = vma->vm_start;

--
1.6.5


2009-10-13 10:11:47

by David Howells

Subject: Re: [PATCH] NOMMU: fix malloc performance by adding uninitialized flag


This seems reasonable. MAP_UNINITIALIZE definitely needs adding to the
general list so that the MMU folks don't steal the bit. I would also call it
MAP_UNINITIALIZED (well, actually, I'd call it MAP_UNINITIALISED:-), otherwise
it looks like you're asking mmap() to uninitialise the memory for you.
Similarly for CONFIG_MMAP_ALLOW_UNINITIALIZE - I'd add a terminal 'D'.

I've also re-written the documentation somewhat and expanded the patch
description to mention the changes to ELF-FDPIC.

David
---
From: Jie Zhang <[email protected]>
Subject: [PATCH] NOMMU: Fix malloc performance by adding uninitialized flag

The NOMMU code currently clears all anonymous mmapped memory. While this is
what we want in the default case, all memory allocation from userspace under
NOMMU has to go through this interface, including malloc() which is allowed to
return uninitialized memory. This can easily be a significant performance
penalty. So for constrained embedded systems where security is irrelevant,
allow people to avoid clearing memory unnecessarily.

This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
memory for the brk and stack region.

Signed-off-by: Jie Zhang <[email protected]>
Signed-off-by: Robin Getz <[email protected]>
Signed-off-by: Mike Frysinger <[email protected]>
Signed-off-by: David Howells <[email protected]>
---

Documentation/nommu-mmap.txt | 26 ++++++++++++++++++++++++++
fs/binfmt_elf_fdpic.c | 3 ++-
include/asm-generic/mman-common.h | 5 +++++
init/Kconfig | 22 ++++++++++++++++++++++
mm/nommu.c | 8 +++++---
5 files changed, 60 insertions(+), 4 deletions(-)


diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
index b565e82..b216ced 100644
--- a/Documentation/nommu-mmap.txt
+++ b/Documentation/nommu-mmap.txt
@@ -119,6 +119,32 @@ FURTHER NOTES ON NO-MMU MMAP
granule but will only discard the excess if appropriately configured as
this has an effect on fragmentation.

+ (*) The memory allocated by a request for an anonymous mapping will normally
+ be cleared by the kernel before being returned in accordance with the
+ Linux man pages (ver 2.22 or later).
+
+ In the MMU case this can be achieved with reasonable performance as
+ regions are backed by virtual pages, with the contents only being mapped
+ to cleared physical pages when a write happens on that specific page
+ (prior to which, the pages are effectively mapped to the global zero page
+ from which reads can take place). This spreads out the time it takes to
+ initialize the contents of a page - depending on the write-usage of the
+ mapping.
+
+ In the no-MMU case, however, anonymous mappings are backed by physical
+ pages, and the entire map is cleared at allocation time. This can cause
+ significant delays during a userspace malloc() as the C library does an
+ anonymous mapping and the kernel then does a memset for the entire map.
+
+ However, for memory that isn't required to be precleared - such as that
+ returned by malloc() - mmap() can take a MAP_UNINITIALIZE flag to indicate
+ to the kernel that it shouldn't bother clearing the memory before
+ returning it. Note that CONFIG_MMAP_ALLOW_UNINITIALIZE must be enabled to
+ permit this, otherwise the flag will be ignored.
+
+ uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
+ to allocate the brk and stack region.
+
(*) A list of all the private copy and anonymous mappings on the system is
visible through /proc/maps in no-MMU mode.

diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index e19b9bb..ab7e57b 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
down_write(&current->mm->mmap_sem);
current->mm->start_brk = do_mmap(NULL, 0, stack_size,
PROT_READ | PROT_WRITE | PROT_EXEC,
- MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
+ MAP_PRIVATE | MAP_ANONYMOUS |
+ MAP_UNINITIALIZE | MAP_GROWSDOWN,
0);

if (IS_ERR_VALUE(current->mm->start_brk)) {
diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
index 5ee13b2..dddf626 100644
--- a/include/asm-generic/mman-common.h
+++ b/include/asm-generic/mman-common.h
@@ -19,6 +19,11 @@
#define MAP_TYPE 0x0f /* Mask for type of mapping */
#define MAP_FIXED 0x10 /* Interpret addr exactly */
#define MAP_ANONYMOUS 0x20 /* don't use a file */
+#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZE
+# define MAP_UNINITIALIZE 0x4000000 /* For anonymous mmap, memory could be uninitialized */
+#else
+# define MAP_UNINITIALIZE 0x0 /* Don't support this flag */
+#endif

#define MS_ASYNC 1 /* sync memory asynchronously */
#define MS_INVALIDATE 2 /* invalidate the caches */
diff --git a/init/Kconfig b/init/Kconfig
index 09c5c64..9c0ffd1 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1069,6 +1069,28 @@ config SLOB

endchoice

+config MMAP_ALLOW_UNINITIALIZE
+ bool "Allow mmapped anonymous memory to be un-initialized"
+ depends on EMBEDDED && !MMU
+ default n
+ help
+ Normally, and according to the Linux spec, anonymous memory obtained
+ from mmap() has its contents cleared before it is passed to
+ userspace. Enabling this config option allows you to request that
+ mmap() skip that if it is given a MAP_UNINITIALIZE flag, thus
+ providing a huge performance boost. If this option is not enabled,
+ then the flag will be ignored.
+
+ This is taken advantage of by uClibc's malloc(), and also by
+ ELF-FDPIC binfmt's brk and stack allocator.
+
+ Because of the obvious security issues, this option should only be
+ enabled on embedded devices where you control what is run in
+ userspace. Since that isn't generally a problem on no-MMU systems,
+ it is normally safe to say Y here.
+
+ See Documentation/nommu-mmap.txt for more information.
+
config PROFILING
bool "Profiling support (EXPERIMENTAL)"
help
diff --git a/mm/nommu.c b/mm/nommu.c
index eefce2d..688f6d0 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
if (ret < rlen)
memset(base + ret, 0, rlen - ret);

- } else {
- /* if it's an anonymous mapping, then just clear it */
- memset(base, 0, rlen);
}

return 0;
@@ -1343,6 +1340,11 @@ unsigned long do_mmap_pgoff(struct file *file,
goto error_just_free;
add_nommu_region(region);

+ /* clear anonymous mappings that don't ask for uninitialized data */
+ if (!vma->vm_file && !(flags & MAP_UNINITIALIZE))
+ memset((void *)region->vm_start, 0,
+ region->vm_end - region->vm_start);
+
/* okay... we have a mapping; now we have to register it */
result = vma->vm_start;

2009-10-13 11:21:01

by Mike Frysinger

Subject: [PATCH v2] NOMMU: fix malloc performance by adding uninitialized flag

From: Jie Zhang <[email protected]>

The NOMMU code currently clears all anonymous mmapped memory. While this
is what we want in the default case, all memory allocation from userspace
under NOMMU has to go through this interface, including malloc() which is
allowed to return uninitialized memory. This can easily be a significant
performance penalty. So for constrained embedded systems where security is
irrelevant, allow people to avoid clearing memory unnecessarily.

This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
memory for the brk and stack region.

Signed-off-by: Jie Zhang <[email protected]>
Signed-off-by: Robin Getz <[email protected]>
Signed-off-by: Mike Frysinger <[email protected]>
Signed-off-by: David Howells <[email protected]>
---
v2
- add a 'D' suffix in define names

David: i think i slurped all your changes into this and changed all the
occurrences to have a D suffix ...

Documentation/nommu-mmap.txt | 26 ++++++++++++++++++++++++++
fs/binfmt_elf_fdpic.c | 3 ++-
include/asm-generic/mman-common.h | 5 +++++
init/Kconfig | 22 ++++++++++++++++++++++
mm/nommu.c | 8 +++++---
5 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
index b565e82..8e1ddec 100644
--- a/Documentation/nommu-mmap.txt
+++ b/Documentation/nommu-mmap.txt
@@ -119,6 +119,32 @@ FURTHER NOTES ON NO-MMU MMAP
granule but will only discard the excess if appropriately configured as
this has an effect on fragmentation.

+ (*) The memory allocated by a request for an anonymous mapping will normally
+ be cleared by the kernel before being returned in accordance with the
+ Linux man pages (ver 2.22 or later).
+
+ In the MMU case this can be achieved with reasonable performance as
+ regions are backed by virtual pages, with the contents only being mapped
+ to cleared physical pages when a write happens on that specific page
+ (prior to which, the pages are effectively mapped to the global zero page
+ from which reads can take place). This spreads out the time it takes to
+ initialize the contents of a page - depending on the write-usage of the
+ mapping.
+
+ In the no-MMU case, however, anonymous mappings are backed by physical
+ pages, and the entire map is cleared at allocation time. This can cause
+ significant delays during a userspace malloc() as the C library does an
+ anonymous mapping and the kernel then does a memset for the entire map.
+
+ However, for memory that isn't required to be precleared - such as that
+ returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
+ indicate to the kernel that it shouldn't bother clearing the memory before
+ returning it. Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
+ to permit this, otherwise the flag will be ignored.
+
+ uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
+ to allocate the brk and stack region.
+
(*) A list of all the private copy and anonymous mappings on the system is
visible through /proc/maps in no-MMU mode.

diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 38502c6..79d2b1a 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
down_write(&current->mm->mmap_sem);
current->mm->start_brk = do_mmap(NULL, 0, stack_size,
PROT_READ | PROT_WRITE | PROT_EXEC,
- MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
+ MAP_PRIVATE | MAP_ANONYMOUS |
+ MAP_UNINITIALIZED | MAP_GROWSDOWN,
0);

if (IS_ERR_VALUE(current->mm->start_brk)) {
diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
index 5ee13b2..2011126 100644
--- a/include/asm-generic/mman-common.h
+++ b/include/asm-generic/mman-common.h
@@ -19,6 +19,11 @@
#define MAP_TYPE 0x0f /* Mask for type of mapping */
#define MAP_FIXED 0x10 /* Interpret addr exactly */
#define MAP_ANONYMOUS 0x20 /* don't use a file */
+#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
+# define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be uninitialized */
+#else
+# define MAP_UNINITIALIZED 0x0 /* Don't support this flag */
+#endif

#define MS_ASYNC 1 /* sync memory asynchronously */
#define MS_INVALIDATE 2 /* invalidate the caches */
diff --git a/init/Kconfig b/init/Kconfig
index 09c5c64..817aeca 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1069,6 +1069,28 @@ config SLOB

endchoice

+config MMAP_ALLOW_UNINITIALIZED
+ bool "Allow mmaped anonymous memory to be un-initialized"
+ depends on EMBEDDED && !MMU
+ default n
+ help
+ Normally, and according to the Linux spec, anonymous memory obtained
+ from mmap() has its contents cleared before it is passed to
+ userspace. Enabling this config option allows you to request that
+ mmap() skip that if it is given a MAP_UNINITIALIZED flag, thus
+ providing a huge performance boost. If this option is not enabled,
+ then the flag will be ignored.
+
+ This is taken advantage of by uClibc's malloc(), and also by
+ ELF-FDPIC binfmt's brk and stack allocator.
+
+ Because of the obvious security issues, this option should only be
+ enabled on embedded devices where you control what is run in
+ userspace. Since that isn't generally a problem on no-MMU systems,
+ it is normally safe to say Y here.
+
+ See Documentation/nommu-mmap.txt for more information.
+
config PROFILING
bool "Profiling support (EXPERIMENTAL)"
help
diff --git a/mm/nommu.c b/mm/nommu.c
index 5189b5a..11e8231 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
if (ret < rlen)
memset(base + ret, 0, rlen - ret);

- } else {
- /* if it's an anonymous mapping, then just clear it */
- memset(base, 0, rlen);
}

return 0;
@@ -1343,6 +1340,11 @@ unsigned long do_mmap_pgoff(struct file *file,
goto error_just_free;
add_nommu_region(region);

+ /* clear anonymous mappings that don't ask for uninitialized data */
+ if (!vma->vm_file && !(flags & MAP_UNINITIALIZED))
+ memset((void *)region->vm_start, 0,
+ region->vm_end - region->vm_start);
+
/* okay... we have a mapping; now we have to register it */
result = vma->vm_start;

--
1.6.5
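
Editorial note on the mm/nommu.c and mman-common.h hunks above: because MAP_UNINITIALIZED is defined as 0x0 when CONFIG_MMAP_ALLOW_UNINITIALIZED is off, the test (flags & MAP_UNINITIALIZED) can never be true and anonymous mappings are always cleared. The following is only a userspace sketch of that decision, not kernel code. The helper name sketch_clear_if_needed, the SKETCH_ prefix, and the uninit_mask parameter (standing in for the macro's two possible values) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same bit value as in the patch; the name is illustrative only. */
#define SKETCH_MAP_UNINITIALIZED 0x4000000UL

/*
 * Hypothetical helper, not a kernel function. It mirrors the condition
 * added to do_mmap_pgoff(): clear the region unless it is file-backed or
 * the caller asked for uninitialized memory.
 *
 * uninit_mask plays the role of the MAP_UNINITIALIZED macro: the real bit
 * when the config option is enabled, 0 when it is disabled.
 *
 * Returns 1 if the buffer was cleared, 0 if it was left untouched.
 */
static int sketch_clear_if_needed(void *base, size_t len, int has_file,
                                  unsigned long flags,
                                  unsigned long uninit_mask)
{
    assert(base != NULL);

    if (!has_file && !(flags & uninit_mask)) {
        memset(base, 0, len);
        return 1;
    }
    return 0;
}
```

With uninit_mask set to 0, passing the flag has no effect and the buffer is still zeroed, which matches the "flag will be ignored" wording in the documentation hunk.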

2009-10-13 13:05:17

by Paul Mundt

Subject: Re: [PATCH v2] NOMMU: fix malloc performance by adding uninitialized flag

On Tue, Oct 13, 2009 at 07:20:21AM -0400, Mike Frysinger wrote:
> From: Jie Zhang <[email protected]>
>
> The NOMMU code currently clears all anonymous mmapped memory. While this
> is what we want in the default case, all memory allocation from userspace
> under NOMMU has to go through this interface, including malloc() which is
> allowed to return uninitialized memory. This can easily be a significant
> performance penalty. So for constrained embedded systems where security is
> irrelevant, allow people to avoid clearing memory unnecessarily.
>
> This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
> memory for the brk and stack region.
>
> Signed-off-by: Jie Zhang <[email protected]>
> Signed-off-by: Robin Getz <[email protected]>
> Signed-off-by: Mike Frysinger <[email protected]>
> Signed-off-by: David Howells <[email protected]>

Acked-by: Paul Mundt <[email protected]>

2009-10-13 15:21:29

by Robin Getz

Subject: Re: [uClinux-dev] [PATCH] NOMMU: fix malloc performance by adding uninitialized flag

On Tue 13 Oct 2009 03:44, Mike Frysinger pondered:
> From: Jie Zhang <[email protected]>
>
> The no-mmu code currently clears all anonymous mmap-ed memory. While this
> is what we want in the default case, all memory allocation from userspace
> under no-mmu has to go through this interface, including malloc() which is
> allowed to return uninitialized memory. This can easily be a significant
> performance slowdown. So for constrained embedded systems where security
> is irrelevant, allow people to avoid unnecessarily clearing memory.
>
> Signed-off-by: Jie Zhang <[email protected]>
> Signed-off-by: Robin Getz <[email protected]>

Should be [email protected] -- [email protected] does not exist and will bounce.

2009-10-13 16:05:12

by David Howells

Subject: Re: [PATCH v2] NOMMU: fix malloc performance by adding uninitialized flag

Mike Frysinger <[email protected]> wrote:

> + bool "Allow mmaped anonymous memory to be un-initialized"

Can you change that to be 'mmapped' and 'uninitialized'?

Other than that, it looks good.

David

2009-10-13 21:31:50

by Mike Frysinger

Subject: [PATCH v3] NOMMU: fix malloc performance by adding uninitialized flag

From: Jie Zhang <[email protected]>

The NOMMU code currently clears all anonymous mmapped memory. While this
is what we want in the default case, all memory allocation from userspace
under NOMMU has to go through this interface, including malloc() which is
allowed to return uninitialized memory. This can easily be a significant
performance penalty. So for constrained embedded systems where security is
irrelevant, allow people to avoid clearing memory unnecessarily.

This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
memory for the brk and stack region.

Signed-off-by: Jie Zhang <[email protected]>
Signed-off-by: Robin Getz <[email protected]>
Signed-off-by: Mike Frysinger <[email protected]>
Signed-off-by: David Howells <[email protected]>
Acked-by: Paul Mundt <[email protected]>
---
v3
- tweak kconfig desc

Documentation/nommu-mmap.txt | 26 ++++++++++++++++++++++++++
fs/binfmt_elf_fdpic.c | 3 ++-
include/asm-generic/mman-common.h | 5 +++++
init/Kconfig | 22 ++++++++++++++++++++++
mm/nommu.c | 8 +++++---
5 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
index b565e82..8e1ddec 100644
--- a/Documentation/nommu-mmap.txt
+++ b/Documentation/nommu-mmap.txt
@@ -119,6 +119,32 @@ FURTHER NOTES ON NO-MMU MMAP
granule but will only discard the excess if appropriately configured as
this has an effect on fragmentation.

+ (*) The memory allocated by a request for an anonymous mapping will normally
+ be cleared by the kernel before being returned in accordance with the
+ Linux man pages (ver 2.22 or later).
+
+ In the MMU case this can be achieved with reasonable performance as
+ regions are backed by virtual pages, with the contents only being mapped
+ to cleared physical pages when a write happens on that specific page
+ (prior to which, the pages are effectively mapped to the global zero page
+ from which reads can take place). This spreads out the time it takes to
+ initialize the contents of a page - depending on the write-usage of the
+ mapping.
+
+ In the no-MMU case, however, anonymous mappings are backed by physical
+ pages, and the entire map is cleared at allocation time. This can cause
+ significant delays during a userspace malloc() as the C library does an
+ anonymous mapping and the kernel then does a memset for the entire map.
+
+ However, for memory that isn't required to be precleared - such as that
+ returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
+ indicate to the kernel that it shouldn't bother clearing the memory before
+ returning it. Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
+ to permit this, otherwise the flag will be ignored.
+
+ uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
+ to allocate the brk and stack region.
+
(*) A list of all the private copy and anonymous mappings on the system is
visible through /proc/maps in no-MMU mode.

diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
index 38502c6..79d2b1a 100644
--- a/fs/binfmt_elf_fdpic.c
+++ b/fs/binfmt_elf_fdpic.c
@@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
down_write(&current->mm->mmap_sem);
current->mm->start_brk = do_mmap(NULL, 0, stack_size,
PROT_READ | PROT_WRITE | PROT_EXEC,
- MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
+ MAP_PRIVATE | MAP_ANONYMOUS |
+ MAP_UNINITIALIZED | MAP_GROWSDOWN,
0);

if (IS_ERR_VALUE(current->mm->start_brk)) {
diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
index 5ee13b2..2011126 100644
--- a/include/asm-generic/mman-common.h
+++ b/include/asm-generic/mman-common.h
@@ -19,6 +19,11 @@
#define MAP_TYPE 0x0f /* Mask for type of mapping */
#define MAP_FIXED 0x10 /* Interpret addr exactly */
#define MAP_ANONYMOUS 0x20 /* don't use a file */
+#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
+# define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be uninitialized */
+#else
+# define MAP_UNINITIALIZED 0x0 /* Don't support this flag */
+#endif

#define MS_ASYNC 1 /* sync memory asynchronously */
#define MS_INVALIDATE 2 /* invalidate the caches */
diff --git a/init/Kconfig b/init/Kconfig
index 09c5c64..309cd9a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1069,6 +1069,28 @@ config SLOB

endchoice

+config MMAP_ALLOW_UNINITIALIZED
+ bool "Allow mmapped anonymous memory to be uninitialized"
+ depends on EMBEDDED && !MMU
+ default n
+ help
+ Normally, and according to the Linux spec, anonymous memory obtained
+ from mmap() has its contents cleared before it is passed to
+ userspace. Enabling this config option allows you to request that
+ mmap() skip that if it is given a MAP_UNINITIALIZED flag, thus
+ providing a huge performance boost. If this option is not enabled,
+ then the flag will be ignored.
+
+ This is taken advantage of by uClibc's malloc(), and also by
+ ELF-FDPIC binfmt's brk and stack allocator.
+
+ Because of the obvious security issues, this option should only be
+ enabled on embedded devices where you control what is run in
+ userspace. Since that isn't generally a problem on no-MMU systems,
+ it is normally safe to say Y here.
+
+ See Documentation/nommu-mmap.txt for more information.
+
config PROFILING
bool "Profiling support (EXPERIMENTAL)"
help
diff --git a/mm/nommu.c b/mm/nommu.c
index 5189b5a..11e8231 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
if (ret < rlen)
memset(base + ret, 0, rlen - ret);

- } else {
- /* if it's an anonymous mapping, then just clear it */
- memset(base, 0, rlen);
}

return 0;
@@ -1343,6 +1340,11 @@ unsigned long do_mmap_pgoff(struct file *file,
goto error_just_free;
add_nommu_region(region);

+ /* clear anonymous mappings that don't ask for uninitialized data */
+ if (!vma->vm_file && !(flags & MAP_UNINITIALIZED))
+ memset((void *)region->vm_start, 0,
+ region->vm_end - region->vm_start);
+
/* okay... we have a mapping; now we have to register it */
result = vma->vm_start;

--
1.6.5
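
Editorial note: the thread says uClibc's malloc() passes the new flag to mmap() but does not show the caller side. Below is a minimal sketch of how a userspace allocator might request such a mapping, assuming a Linux target. It is not uClibc's actual code. The function names sketch_alloc_uninit and sketch_free are hypothetical, and the fallback definition of MAP_UNINITIALIZED as 0 is an assumption made so the sketch builds against headers that lack the flag.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed fallback: treat the flag as a no-op if the headers lack it. */
#ifndef MAP_UNINITIALIZED
# define MAP_UNINITIALIZED 0
#endif

/*
 * Hypothetical allocator back end: ask for an anonymous private mapping
 * and tell the kernel that the contents need not be cleared. The caller
 * must treat the returned memory as uninitialized, as with malloc().
 * Returns NULL on failure.
 */
static void *sketch_alloc_uninit(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_UNINITIALIZED, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Release a mapping obtained from sketch_alloc_uninit(). */
static void sketch_free(void *p, size_t len)
{
    if (p)
        munmap(p, len);
}
```

Code written this way behaves the same whether or not the kernel honours the flag. The only difference is whether the memory happens to arrive zeroed, which the caller must not rely on.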

2009-10-13 23:15:31

by David McCullough

Subject: Re: [PATCH v3] NOMMU: fix malloc performance by adding uninitialized flag


Jivin Mike Frysinger lays it down ...
> From: Jie Zhang <[email protected]>
>
> The NOMMU code currently clears all anonymous mmapped memory. While this
> is what we want in the default case, all memory allocation from userspace
> under NOMMU has to go through this interface, including malloc() which is
> allowed to return uninitialized memory. This can easily be a significant
> performance penalty. So for constrained embedded systems where security is
> irrelevant, allow people to avoid clearing memory unnecessarily.
>
> This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
> memory for the brk and stack region.
>
> Signed-off-by: Jie Zhang <[email protected]>
> Signed-off-by: Robin Getz <[email protected]>
> Signed-off-by: Mike Frysinger <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> Acked-by: Paul Mundt <[email protected]>

Acked-by: David McCullough <[email protected]>

Cheers,
Davidm

> v3
> - tweak kconfig desc
>
> Documentation/nommu-mmap.txt | 26 ++++++++++++++++++++++++++
> fs/binfmt_elf_fdpic.c | 3 ++-
> include/asm-generic/mman-common.h | 5 +++++
> init/Kconfig | 22 ++++++++++++++++++++++
> mm/nommu.c | 8 +++++---
> 5 files changed, 60 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
> index b565e82..8e1ddec 100644
> --- a/Documentation/nommu-mmap.txt
> +++ b/Documentation/nommu-mmap.txt
> @@ -119,6 +119,32 @@ FURTHER NOTES ON NO-MMU MMAP
> granule but will only discard the excess if appropriately configured as
> this has an effect on fragmentation.
>
> + (*) The memory allocated by a request for an anonymous mapping will normally
> + be cleared by the kernel before being returned in accordance with the
> + Linux man pages (ver 2.22 or later).
> +
> + In the MMU case this can be achieved with reasonable performance as
> + regions are backed by virtual pages, with the contents only being mapped
> + to cleared physical pages when a write happens on that specific page
> + (prior to which, the pages are effectively mapped to the global zero page
> + from which reads can take place). This spreads out the time it takes to
> + initialize the contents of a page - depending on the write-usage of the
> + mapping.
> +
> + In the no-MMU case, however, anonymous mappings are backed by physical
> + pages, and the entire map is cleared at allocation time. This can cause
> + significant delays during a userspace malloc() as the C library does an
> + anonymous mapping and the kernel then does a memset for the entire map.
> +
> + However, for memory that isn't required to be precleared - such as that
> + returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
> + indicate to the kernel that it shouldn't bother clearing the memory before
> + returning it. Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
> + to permit this, otherwise the flag will be ignored.
> +
> + uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
> + to allocate the brk and stack region.
> +
> (*) A list of all the private copy and anonymous mappings on the system is
> visible through /proc/maps in no-MMU mode.
>
> diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
> index 38502c6..79d2b1a 100644
> --- a/fs/binfmt_elf_fdpic.c
> +++ b/fs/binfmt_elf_fdpic.c
> @@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
> down_write(&current->mm->mmap_sem);
> current->mm->start_brk = do_mmap(NULL, 0, stack_size,
> PROT_READ | PROT_WRITE | PROT_EXEC,
> - MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
> + MAP_PRIVATE | MAP_ANONYMOUS |
> + MAP_UNINITIALIZED | MAP_GROWSDOWN,
> 0);
>
> if (IS_ERR_VALUE(current->mm->start_brk)) {
> diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
> index 5ee13b2..2011126 100644
> --- a/include/asm-generic/mman-common.h
> +++ b/include/asm-generic/mman-common.h
> @@ -19,6 +19,11 @@
> #define MAP_TYPE 0x0f /* Mask for type of mapping */
> #define MAP_FIXED 0x10 /* Interpret addr exactly */
> #define MAP_ANONYMOUS 0x20 /* don't use a file */
> +#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
> +# define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be uninitialized */
> +#else
> +# define MAP_UNINITIALIZED 0x0 /* Don't support this flag */
> +#endif
>
> #define MS_ASYNC 1 /* sync memory asynchronously */
> #define MS_INVALIDATE 2 /* invalidate the caches */
> diff --git a/init/Kconfig b/init/Kconfig
> index 09c5c64..309cd9a 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1069,6 +1069,28 @@ config SLOB
>
> endchoice
>
> +config MMAP_ALLOW_UNINITIALIZED
> + bool "Allow mmapped anonymous memory to be uninitialized"
> + depends on EMBEDDED && !MMU
> + default n
> + help
> + Normally, and according to the Linux spec, anonymous memory obtained
> + from mmap() has its contents cleared before it is passed to
> + userspace. Enabling this config option allows you to request that
> + mmap() skip that if it is given a MAP_UNINITIALIZED flag, thus
> + providing a huge performance boost. If this option is not enabled,
> + then the flag will be ignored.
> +
> + This is taken advantage of by uClibc's malloc(), and also by
> + ELF-FDPIC binfmt's brk and stack allocator.
> +
> + Because of the obvious security issues, this option should only be
> + enabled on embedded devices where you control what is run in
> + userspace. Since that isn't generally a problem on no-MMU systems,
> + it is normally safe to say Y here.
> +
> + See Documentation/nommu-mmap.txt for more information.
> +
> config PROFILING
> bool "Profiling support (EXPERIMENTAL)"
> help
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 5189b5a..11e8231 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
> if (ret < rlen)
> memset(base + ret, 0, rlen - ret);
>
> - } else {
> - /* if it's an anonymous mapping, then just clear it */
> - memset(base, 0, rlen);
> }
>
> return 0;
> @@ -1343,6 +1340,11 @@ unsigned long do_mmap_pgoff(struct file *file,
> goto error_just_free;
> add_nommu_region(region);
>
> + /* clear anonymous mappings that don't ask for uninitialized data */
> + if (!vma->vm_file && !(flags & MAP_UNINITIALIZED))
> + memset((void *)region->vm_start, 0,
> + region->vm_end - region->vm_start);
> +
> /* okay... we have a mapping; now we have to register it */
> result = vma->vm_start;
>
> --
> 1.6.5
>
>
>

--
David McCullough, [email protected], Ph:+61 734352815
McAfee - SnapGear http://www.snapgear.com http://www.uCdot.org

2009-10-14 00:27:05

by Greg Ungerer

Subject: Re: [PATCH v3] NOMMU: fix malloc performance by adding uninitialized flag


Mike Frysinger wrote:
> From: Jie Zhang <[email protected]>
>
> The NOMMU code currently clears all anonymous mmapped memory. While this
> is what we want in the default case, all memory allocation from userspace
> under NOMMU has to go through this interface, including malloc() which is
> allowed to return uninitialized memory. This can easily be a significant
> performance penalty. So for constrained embedded systems where security is
> irrelevant, allow people to avoid clearing memory unnecessarily.
>
> This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
> memory for the brk and stack region.
>
> Signed-off-by: Jie Zhang <[email protected]>
> Signed-off-by: Robin Getz <[email protected]>
> Signed-off-by: Mike Frysinger <[email protected]>
> Signed-off-by: David Howells <[email protected]>
> Acked-by: Paul Mundt <[email protected]>

Acked-by: Greg Ungerer <[email protected]>


> ---
> v3
> - tweak kconfig desc
>
> Documentation/nommu-mmap.txt | 26 ++++++++++++++++++++++++++
> fs/binfmt_elf_fdpic.c | 3 ++-
> include/asm-generic/mman-common.h | 5 +++++
> init/Kconfig | 22 ++++++++++++++++++++++
> mm/nommu.c | 8 +++++---
> 5 files changed, 60 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
> index b565e82..8e1ddec 100644
> --- a/Documentation/nommu-mmap.txt
> +++ b/Documentation/nommu-mmap.txt
> @@ -119,6 +119,32 @@ FURTHER NOTES ON NO-MMU MMAP
> granule but will only discard the excess if appropriately configured as
> this has an effect on fragmentation.
>
> + (*) The memory allocated by a request for an anonymous mapping will normally
> + be cleared by the kernel before being returned in accordance with the
> + Linux man pages (ver 2.22 or later).
> +
> + In the MMU case this can be achieved with reasonable performance as
> + regions are backed by virtual pages, with the contents only being mapped
> + to cleared physical pages when a write happens on that specific page
> + (prior to which, the pages are effectively mapped to the global zero page
> + from which reads can take place). This spreads out the time it takes to
> + initialize the contents of a page - depending on the write-usage of the
> + mapping.
> +
> + In the no-MMU case, however, anonymous mappings are backed by physical
> + pages, and the entire map is cleared at allocation time. This can cause
> + significant delays during a userspace malloc() as the C library does an
> + anonymous mapping and the kernel then does a memset for the entire map.
> +
> + However, for memory that isn't required to be precleared - such as that
> + returned by malloc() - mmap() can take a MAP_UNINITIALIZED flag to
> + indicate to the kernel that it shouldn't bother clearing the memory before
> + returning it. Note that CONFIG_MMAP_ALLOW_UNINITIALIZED must be enabled
> + to permit this, otherwise the flag will be ignored.
> +
> + uClibc uses this to speed up malloc(), and the ELF-FDPIC binfmt uses this
> + to allocate the brk and stack region.
> +
> (*) A list of all the private copy and anonymous mappings on the system is
> visible through /proc/maps in no-MMU mode.
>
> diff --git a/fs/binfmt_elf_fdpic.c b/fs/binfmt_elf_fdpic.c
> index 38502c6..79d2b1a 100644
> --- a/fs/binfmt_elf_fdpic.c
> +++ b/fs/binfmt_elf_fdpic.c
> @@ -380,7 +380,8 @@ static int load_elf_fdpic_binary(struct linux_binprm *bprm,
> down_write(&current->mm->mmap_sem);
> current->mm->start_brk = do_mmap(NULL, 0, stack_size,
> PROT_READ | PROT_WRITE | PROT_EXEC,
> - MAP_PRIVATE | MAP_ANONYMOUS | MAP_GROWSDOWN,
> + MAP_PRIVATE | MAP_ANONYMOUS |
> + MAP_UNINITIALIZED | MAP_GROWSDOWN,
> 0);
>
> if (IS_ERR_VALUE(current->mm->start_brk)) {
> diff --git a/include/asm-generic/mman-common.h b/include/asm-generic/mman-common.h
> index 5ee13b2..2011126 100644
> --- a/include/asm-generic/mman-common.h
> +++ b/include/asm-generic/mman-common.h
> @@ -19,6 +19,11 @@
> #define MAP_TYPE 0x0f /* Mask for type of mapping */
> #define MAP_FIXED 0x10 /* Interpret addr exactly */
> #define MAP_ANONYMOUS 0x20 /* don't use a file */
> +#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
> +# define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be uninitialized */
> +#else
> +# define MAP_UNINITIALIZED 0x0 /* Don't support this flag */
> +#endif
>
> #define MS_ASYNC 1 /* sync memory asynchronously */
> #define MS_INVALIDATE 2 /* invalidate the caches */
> diff --git a/init/Kconfig b/init/Kconfig
> index 09c5c64..309cd9a 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1069,6 +1069,28 @@ config SLOB
>
> endchoice
>
> +config MMAP_ALLOW_UNINITIALIZED
> + bool "Allow mmapped anonymous memory to be uninitialized"
> + depends on EMBEDDED && !MMU
> + default n
> + help
> + Normally, and according to the Linux spec, anonymous memory obtained
> + from mmap() has its contents cleared before it is passed to
> + userspace. Enabling this config option allows you to request that
> + mmap() skip that if it is given a MAP_UNINITIALIZED flag, thus
> + providing a huge performance boost. If this option is not enabled,
> + then the flag will be ignored.
> +
> + This is taken advantage of by uClibc's malloc(), and also by
> + ELF-FDPIC binfmt's brk and stack allocator.
> +
> + Because of the obvious security issues, this option should only be
> + enabled on embedded devices where you control what is run in
> + userspace. Since that isn't generally a problem on no-MMU systems,
> + it is normally safe to say Y here.
> +
> + See Documentation/nommu-mmap.txt for more information.
> +
> config PROFILING
> bool "Profiling support (EXPERIMENTAL)"
> help
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 5189b5a..11e8231 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -1143,9 +1143,6 @@ static int do_mmap_private(struct vm_area_struct *vma,
> if (ret < rlen)
> memset(base + ret, 0, rlen - ret);
>
> - } else {
> - /* if it's an anonymous mapping, then just clear it */
> - memset(base, 0, rlen);
> }
>
> return 0;
> @@ -1343,6 +1340,11 @@ unsigned long do_mmap_pgoff(struct file *file,
> goto error_just_free;
> add_nommu_region(region);
>
> + /* clear anonymous mappings that don't ask for uninitialized data */
> + if (!vma->vm_file && !(flags & MAP_UNINITIALIZED))
> + memset((void *)region->vm_start, 0,
> + region->vm_end - region->vm_start);
> +
> /* okay... we have a mapping; now we have to register it */
> result = vma->vm_start;
>

--
------------------------------------------------------------------------
Greg Ungerer -- Principal Engineer EMAIL: [email protected]
SnapGear Group, McAfee PHONE: +61 7 3435 2888
825 Stanley St, FAX: +61 7 3891 3630
Woolloongabba, QLD, 4102, Australia WEB: http://www.SnapGear.com