2022-08-09 19:06:05

by Rik van Riel

[permalink] [raw]
Subject: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

Align larger anonymous memory mappings on THP boundaries by
going through thp_get_unmapped_area if THPs are enabled for
the current process.

With this patch, larger anonymous mappings are now THP aligned.
When a malloc library allocates a 2MB or larger arena, that
arena can now be mapped with THPs right from the start, which
can result in better TLB hit rates and execution time.

Signed-off-by: Rik van Riel <[email protected]>
---
v2: avoid the chicken & egg issue with MMF_VM_HUGEPAGE (Yang Shi)

mm/mmap.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index c035020d0c89..1d859893436d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2229,6 +2229,9 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
*/
pgoff = 0;
get_area = shmem_get_unmapped_area;
+ } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+ /* Ensures that larger anonymous mappings are THP aligned. */
+ get_area = thp_get_unmapped_area;
}

addr = get_area(file, addr, len, pgoff, flags);
--
2.37.1



2022-08-10 17:47:15

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Tue, Aug 9, 2022 at 11:25 AM Rik van Riel <[email protected]> wrote:
>
> Align larger anonymous memory mappings on THP boundaries by
> going through thp_get_unmapped_area if THPs are enabled for
> the current process.
>
> With this patch, larger anonymous mappings are now THP aligned.
> When a malloc library allocates a 2MB or larger arena, that
> arena can now be mapped with THPs right from the start, which
> can result in better TLB hit rates and execution time.
>
> Signed-off-by: Rik van Riel <[email protected]>
> ---
> v2: avoid the chicken & egg issue with MMF_VM_HUGEPAGE (Yang Shi)

Reviewed-by: Yang Shi <[email protected]>

>
> mm/mmap.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index c035020d0c89..1d859893436d 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2229,6 +2229,9 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
> */
> pgoff = 0;
> get_area = shmem_get_unmapped_area;
> + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> + /* Ensures that larger anonymous mappings are THP aligned. */
> + get_area = thp_get_unmapped_area;
> }
>
> addr = get_area(file, addr, len, pgoff, flags);
> --
> 2.37.1
>
>

2024-01-16 11:53:22

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

Hi,

On 09. 08. 22, 20:24, Rik van Riel wrote:
> Align larger anonymous memory mappings on THP boundaries by
> going through thp_get_unmapped_area if THPs are enabled for
> the current process.
>
> With this patch, larger anonymous mappings are now THP aligned.
> When a malloc library allocates a 2MB or larger arena, that
> arena can now be mapped with THPs right from the start, which
> can result in better TLB hit rates and execution time.

This appears to break 32bit processes on x86_64 (at least). In
particular, 32bit kernel or firefox builds in our build system.

Reverting this on top of 6.7 makes it work again.

Downstream report:
https://bugzilla.suse.com/show_bug.cgi?id=1218841

So running:
pahole -J --btf_gen_floats -j --lang_exclude=rust
--skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf

crashes or errors out with some random errors:
[182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
Error emitting field

strace shows mmap() fails with ENOMEM right before the errors:
1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
..
1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate memory)

Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
enough. For reference, one is available at:
https://decibel.fi.muni.cz/~xslaby/n/btf

Any ideas?

> Signed-off-by: Rik van Riel <[email protected]>
> ---
> v2: avoid the chicken & egg issue with MMF_VM_HUGEPAGE (Yang Shi)
>
> mm/mmap.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index c035020d0c89..1d859893436d 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2229,6 +2229,9 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
> */
> pgoff = 0;
> get_area = shmem_get_unmapped_area;
> + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> + /* Ensures that larger anonymous mappings are THP aligned. */
> + get_area = thp_get_unmapped_area;
> }
>
> addr = get_area(file, addr, len, pgoff, flags);

thanks,
--
js
suse labs


2024-01-16 12:09:49

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On 16. 01. 24, 12:53, Jiri Slaby wrote:
> Hi,
>
> On 09. 08. 22, 20:24, Rik van Riel wrote:
>> Align larger anonymous memory mappings on THP boundaries by
>> going through thp_get_unmapped_area if THPs are enabled for
>> the current process.
>>
>> With this patch, larger anonymous mappings are now THP aligned.
>> When a malloc library allocates a 2MB or larger arena, that
>> arena can now be mapped with THPs right from the start, which
>> can result in better TLB hit rates and execution time.
>
> This appears to break 32bit processes on x86_64 (at least). In
> particular, 32bit kernel or firefox builds in our build system.
>
> Reverting this on top of 6.7 makes it work again.
>
> Downstream report:
>  https://bugzilla.suse.com/show_bug.cgi?id=1218841
>
> So running:
> pahole -J --btf_gen_floats -j --lang_exclude=rust
> --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
>
> crashes or errors out with some random errors:
> [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> Error emitting field
>
> strace shows mmap() fails with ENOMEM right before the errors:
> 1223  mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> ...
> 1223  <... mmap2 resumed>)              = -1 ENOMEM (Cannot allocate
> memory)
>
> Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> enough. For reference, one is available at:
> https://decibel.fi.muni.cz/~xslaby/n/btf
>
> Any ideas?

This works around the problem, of course (but is a band-aid, not a fix):

--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
addr, unsigned long len,
*/
pgoff = 0;
get_area = shmem_get_unmapped_area;
- } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+ } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
!in_32bit_syscall()) {
/* Ensures that larger anonymous mappings are THP
aligned. */
get_area = thp_get_unmapped_area;
}


thp_get_unmapped_area() does not take care of the legacy stuff...

regards,
--
js
suse labs


2024-01-16 22:21:51

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Tue, Jan 16, 2024 at 11:16 AM Suren Baghdasaryan <[email protected]> wrote:
>
> On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> >
> > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > Hi,
> > >
> > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > >> Align larger anonymous memory mappings on THP boundaries by
> > >> going through thp_get_unmapped_area if THPs are enabled for
> > >> the current process.
> > >>
> > >> With this patch, larger anonymous mappings are now THP aligned.
> > >> When a malloc library allocates a 2MB or larger arena, that
> > >> arena can now be mapped with THPs right from the start, which
> > >> can result in better TLB hit rates and execution time.
> > >
> > > This appears to break 32bit processes on x86_64 (at least). In
> > > particular, 32bit kernel or firefox builds in our build system.
> > >
> > > Reverting this on top of 6.7 makes it work again.
> > >
> > > Downstream report:
> > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > >
> > > So running:
> > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > >
> > > crashes or errors out with some random errors:
> > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > Error emitting field
> > >
> > > strace shows mmap() fails with ENOMEM right before the errors:
> > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > ...
> > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > memory)
> > >
> > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > enough. For reference, one is available at:
> > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > >
> > > Any ideas?
> >
> > This works around the problem, of course (but is a band-aid, not a fix):
> >
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > addr, unsigned long len,
> > */
> > pgoff = 0;
> > get_area = shmem_get_unmapped_area;
> > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > !in_32bit_syscall()) {
> > /* Ensures that larger anonymous mappings are THP
> > aligned. */
> > get_area = thp_get_unmapped_area;
> > }
> >
> >
> > thp_get_unmapped_area() does not take care of the legacy stuff...
>
> This change also affects the entropy of allocations. With this patch
> Android test [1] started failing and it requires only 8 bits of
> entropy. The feedback from our security team:
>
> 8 bits of entropy is already embarrassingly low, but was necessary for
> 32 bit ARM targets with low RAM at the time. It's definitely not
> acceptable for 64 bit targets.

Thanks for the report. Is it 32 bit only or 64 bit is also impacted?
If I understand the code correctly, it expects the address allocated
by malloc() is kind of randomized, right?

>
> Could this change be either reverted or made optional (opt-in/opt-out)?
> Thanks,
> Suren.
>
> [1] https://cs.android.com/android/platform/superproject/main/+/main:cts/tests/aslr/src/AslrMallocTest.cpp;l=130
>
> >
> > regards,
> > --
> > js
> > suse labs
> >
> >

2024-01-16 22:34:36

by Suren Baghdasaryan

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Tue, Jan 16, 2024 at 12:56 PM Yang Shi <[email protected]> wrote:
>
> On Tue, Jan 16, 2024 at 11:16 AM Suren Baghdasaryan <[email protected]> wrote:
> >
> > On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> > >
> > > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > > Hi,
> > > >
> > > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > > >> Align larger anonymous memory mappings on THP boundaries by
> > > >> going through thp_get_unmapped_area if THPs are enabled for
> > > >> the current process.
> > > >>
> > > >> With this patch, larger anonymous mappings are now THP aligned.
> > > >> When a malloc library allocates a 2MB or larger arena, that
> > > >> arena can now be mapped with THPs right from the start, which
> > > >> can result in better TLB hit rates and execution time.
> > > >
> > > > This appears to break 32bit processes on x86_64 (at least). In
> > > > particular, 32bit kernel or firefox builds in our build system.
> > > >
> > > > Reverting this on top of 6.7 makes it work again.
> > > >
> > > > Downstream report:
> > > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > > >
> > > > So running:
> > > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > > >
> > > > crashes or errors out with some random errors:
> > > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > > Error emitting field
> > > >
> > > > strace shows mmap() fails with ENOMEM right before the errors:
> > > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > > ...
> > > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > > memory)
> > > >
> > > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > > enough. For reference, one is available at:
> > > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > > >
> > > > Any ideas?
> > >
> > > This works around the problem, of course (but is a band-aid, not a fix):
> > >
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > > addr, unsigned long len,
> > > */
> > > pgoff = 0;
> > > get_area = shmem_get_unmapped_area;
> > > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > > !in_32bit_syscall()) {
> > > /* Ensures that larger anonymous mappings are THP
> > > aligned. */
> > > get_area = thp_get_unmapped_area;
> > > }
> > >
> > >
> > > thp_get_unmapped_area() does not take care of the legacy stuff...
> >
> > This change also affects the entropy of allocations. With this patch
> > Android test [1] started failing and it requires only 8 bits of
> > entropy. The feedback from our security team:
> >
> > 8 bits of entropy is already embarrassingly low, but was necessary for
> > 32 bit ARM targets with low RAM at the time. It's definitely not
> > acceptable for 64 bit targets.
>
> Thanks for the report. Is it 32 bit only or 64 bit is also impacted?
> If I understand the code correctly, it expects the address allocated
> by malloc() is kind of randomized, right?

Yes, correct, the test expects a certain level of address randomization.
The test failure was reported while running kernel_virt_x86_64 target
(Android emulator on x86), so it does impact 64bit targets.

>
> >
> > Could this change be either reverted or made optional (opt-in/opt-out)?
> > Thanks,
> > Suren.
> >
> > [1] https://cs.android.com/android/platform/superproject/main/+/main:cts/tests/aslr/src/AslrMallocTest.cpp;l=130
> >
> > >
> > > regards,
> > > --
> > > js
> > > suse labs
> > >
> > >

2024-01-16 22:49:39

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Tue, Jan 16, 2024 at 1:58 PM Suren Baghdasaryan <[email protected]> wrote:
>
> On Tue, Jan 16, 2024 at 12:56 PM Yang Shi <[email protected]> wrote:
> >
> > On Tue, Jan 16, 2024 at 11:16 AM Suren Baghdasaryan <[email protected]> wrote:
> > >
> > > On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> > > >
> > > > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > > > Hi,
> > > > >
> > > > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > > > >> Align larger anonymous memory mappings on THP boundaries by
> > > > >> going through thp_get_unmapped_area if THPs are enabled for
> > > > >> the current process.
> > > > >>
> > > > >> With this patch, larger anonymous mappings are now THP aligned.
> > > > >> When a malloc library allocates a 2MB or larger arena, that
> > > > >> arena can now be mapped with THPs right from the start, which
> > > > >> can result in better TLB hit rates and execution time.
> > > > >
> > > > > This appears to break 32bit processes on x86_64 (at least). In
> > > > > particular, 32bit kernel or firefox builds in our build system.
> > > > >
> > > > > Reverting this on top of 6.7 makes it work again.
> > > > >
> > > > > Downstream report:
> > > > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > > > >
> > > > > So running:
> > > > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > > > >
> > > > > crashes or errors out with some random errors:
> > > > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > > > Error emitting field
> > > > >
> > > > > strace shows mmap() fails with ENOMEM right before the errors:
> > > > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > > > ...
> > > > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > > > memory)
> > > > >
> > > > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > > > enough. For reference, one is available at:
> > > > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > > > >
> > > > > Any ideas?
> > > >
> > > > This works around the problem, of course (but is a band-aid, not a fix):
> > > >
> > > > --- a/mm/mmap.c
> > > > +++ b/mm/mmap.c
> > > > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > > > addr, unsigned long len,
> > > > */
> > > > pgoff = 0;
> > > > get_area = shmem_get_unmapped_area;
> > > > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > > > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > > > !in_32bit_syscall()) {
> > > > /* Ensures that larger anonymous mappings are THP
> > > > aligned. */
> > > > get_area = thp_get_unmapped_area;
> > > > }
> > > >
> > > >
> > > > thp_get_unmapped_area() does not take care of the legacy stuff...
> > >
> > > This change also affects the entropy of allocations. With this patch
> > > Android test [1] started failing and it requires only 8 bits of
> > > entropy. The feedback from our security team:
> > >
> > > 8 bits of entropy is already embarrassingly low, but was necessary for
> > > 32 bit ARM targets with low RAM at the time. It's definitely not
> > > acceptable for 64 bit targets.
> >
> > Thanks for the report. Is it 32 bit only or 64 bit is also impacted?
> > If I understand the code correctly, it expects the address allocated
> > by malloc() is kind of randomized, right?
>
> Yes, correct, the test expects a certain level of address randomization.
> The test failure was reported while running kernel_virt_x86_64 target
> (Android emulator on x86), so it does impact 64bit targets.

IIUC this breaks the "expectation" for randomized addresses returned
by malloc(), but it doesn't break any real Android application, right?
So this is a security concern instead of a real regression.

I think we can make this opt-in with a knob. Anyone who outweighs
security could opt this feature out. However I'm wondering whether
Android should implement a general address randomization mechanism
instead of depending on "luck" if you do care about it.

>
> >
> > >
> > > Could this change be either reverted or made optional (opt-in/opt-out)?
> > > Thanks,
> > > Suren.
> > >
> > > [1] https://cs.android.com/android/platform/superproject/main/+/main:cts/tests/aslr/src/AslrMallocTest.cpp;l=130
> > >
> > > >
> > > > regards,
> > > > --
> > > > js
> > > > suse labs
> > > >
> > > >

2024-01-17 17:40:25

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Tue, Jan 16, 2024 at 02:30:36PM -0800, Suren Baghdasaryan wrote:
> On Tue, Jan 16, 2024 at 2:25 PM Yang Shi <[email protected]> wrote:
> >
> > On Tue, Jan 16, 2024 at 1:58 PM Suren Baghdasaryan <[email protected]> wrote:
> > >
> > > On Tue, Jan 16, 2024 at 12:56 PM Yang Shi <[email protected]> wrote:
> > > >
> > > > On Tue, Jan 16, 2024 at 11:16 AM Suren Baghdasaryan <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> > > > > >
> > > > > > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > > > > > >> Align larger anonymous memory mappings on THP boundaries by
> > > > > > >> going through thp_get_unmapped_area if THPs are enabled for
> > > > > > >> the current process.
> > > > > > >>
> > > > > > >> With this patch, larger anonymous mappings are now THP aligned.
> > > > > > >> When a malloc library allocates a 2MB or larger arena, that
> > > > > > >> arena can now be mapped with THPs right from the start, which
> > > > > > >> can result in better TLB hit rates and execution time.
> > > > > > >
> > > > > > > This appears to break 32bit processes on x86_64 (at least). In
> > > > > > > particular, 32bit kernel or firefox builds in our build system.
> > > > > > >
> > > > > > > Reverting this on top of 6.7 makes it work again.
> > > > > > >
> > > > > > > Downstream report:
> > > > > > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > > > > > >
> > > > > > > So running:
> > > > > > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > > > > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > > > > > >
> > > > > > > crashes or errors out with some random errors:
> > > > > > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > > > > > Error emitting field
> > > > > > >
> > > > > > > strace shows mmap() fails with ENOMEM right before the errors:
> > > > > > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > > > > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > > > > > ...
> > > > > > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > > > > > memory)
> > > > > > >
> > > > > > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > > > > > enough. For reference, one is available at:
> > > > > > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > > > > > >
> > > > > > > Any ideas?
> > > > > >
> > > > > > This works around the problem, of course (but is a band-aid, not a fix):
> > > > > >
> > > > > > --- a/mm/mmap.c
> > > > > > +++ b/mm/mmap.c
> > > > > > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > > > > > addr, unsigned long len,
> > > > > > */
> > > > > > pgoff = 0;
> > > > > > get_area = shmem_get_unmapped_area;
> > > > > > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > > > > > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > > > > > !in_32bit_syscall()) {
> > > > > > /* Ensures that larger anonymous mappings are THP
> > > > > > aligned. */
> > > > > > get_area = thp_get_unmapped_area;
> > > > > > }
> > > > > >
> > > > > >
> > > > > > thp_get_unmapped_area() does not take care of the legacy stuff...
> > > > >
> > > > > This change also affects the entropy of allocations. With this patch
> > > > > Android test [1] started failing and it requires only 8 bits of
> > > > > entropy. The feedback from our security team:
> > > > >
> > > > > 8 bits of entropy is already embarrassingly low, but was necessary for
> > > > > 32 bit ARM targets with low RAM at the time. It's definitely not
> > > > > acceptable for 64 bit targets.
> > > >
> > > > Thanks for the report. Is it 32 bit only or 64 bit is also impacted?
> > > > If I understand the code correctly, it expects the address allocated
> > > > by malloc() is kind of randomized, right?
> > >
> > > Yes, correct, the test expects a certain level of address randomization.
> > > The test failure was reported while running kernel_virt_x86_64 target
> > > (Android emulator on x86), so it does impact 64bit targets.
> >
> > IIUC this breaks the "expectation" for randomized addresses returned
> > by malloc(), but it doesn't break any real Android application, right?
> > So this is a security concern instead of a real regression.
>
> How is making a system move vulnerabile not a real regression?
>
> >
> > I think we can make this opt-in with a knob. Anyone who outweighs
> > security could opt this feature out. However I'm wondering whether
> > Android should implement a general address randomization mechanism
> > instead of depending on "luck" if you do care about it.
>
> This is not depending on luck. This is checking for possible
> vulnerabilities in the system.
> I admit I'm not a security expert, so I'm looping in Jeff and Kees to chime in.

Hi!

Just to chime in, but reduction in ASLR entropy is absolutely a
regression. This is userspace visible (via /proc/sys/kernel/randomize_va_space,
/proc/sys/vm/mmap_rnd*_bits) with expectations that it work as
advertised. So, while 32-bit might be already ASLR-weak, we don't want
to make things super bad nor break ASLR in compat mode under 64-bit
systems.

Having an opt-in sounds reasonable, but we need to wire it to the ASLR
sysctls in some way so nothing lying about the ASLR entropy.

-Kees

--
Kees Cook

2024-01-18 00:02:22

by Suren Baghdasaryan

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Wed, Jan 17, 2024 at 3:32 PM Yang Shi <[email protected]> wrote:
>
> On Wed, Jan 17, 2024 at 9:40 AM Kees Cook <[email protected]> wrote:
> >
> > On Tue, Jan 16, 2024 at 02:30:36PM -0800, Suren Baghdasaryan wrote:
> > > On Tue, Jan 16, 2024 at 2:25 PM Yang Shi <[email protected]> wrote:
> > > >
> > > > On Tue, Jan 16, 2024 at 1:58 PM Suren Baghdasaryan <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jan 16, 2024 at 12:56 PM Yang Shi <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Jan 16, 2024 at 11:16 AM Suren Baghdasaryan <[email protected]> wrote:
> > > > > > >
> > > > > > > On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > > > > > > > >> Align larger anonymous memory mappings on THP boundaries by
> > > > > > > > >> going through thp_get_unmapped_area if THPs are enabled for
> > > > > > > > >> the current process.
> > > > > > > > >>
> > > > > > > > >> With this patch, larger anonymous mappings are now THP aligned.
> > > > > > > > >> When a malloc library allocates a 2MB or larger arena, that
> > > > > > > > >> arena can now be mapped with THPs right from the start, which
> > > > > > > > >> can result in better TLB hit rates and execution time.
> > > > > > > > >
> > > > > > > > > This appears to break 32bit processes on x86_64 (at least). In
> > > > > > > > > particular, 32bit kernel or firefox builds in our build system.
> > > > > > > > >
> > > > > > > > > Reverting this on top of 6.7 makes it work again.
> > > > > > > > >
> > > > > > > > > Downstream report:
> > > > > > > > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > > > > > > > >
> > > > > > > > > So running:
> > > > > > > > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > > > > > > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > > > > > > > >
> > > > > > > > > crashes or errors out with some random errors:
> > > > > > > > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > > > > > > > Error emitting field
> > > > > > > > >
> > > > > > > > > strace shows mmap() fails with ENOMEM right before the errors:
> > > > > > > > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > > > > > > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > > > > > > > ...
> > > > > > > > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > > > > > > > memory)
> > > > > > > > >
> > > > > > > > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > > > > > > > enough. For reference, one is available at:
> > > > > > > > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > > > > > > > >
> > > > > > > > > Any ideas?
> > > > > > > >
> > > > > > > > This works around the problem, of course (but is a band-aid, not a fix):
> > > > > > > >
> > > > > > > > --- a/mm/mmap.c
> > > > > > > > +++ b/mm/mmap.c
> > > > > > > > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > > > > > > > addr, unsigned long len,
> > > > > > > > */
> > > > > > > > pgoff = 0;
> > > > > > > > get_area = shmem_get_unmapped_area;
> > > > > > > > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > > > > > > > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > > > > > > > !in_32bit_syscall()) {
> > > > > > > > /* Ensures that larger anonymous mappings are THP
> > > > > > > > aligned. */
> > > > > > > > get_area = thp_get_unmapped_area;
> > > > > > > > }
> > > > > > > >
> > > > > > > >
> > > > > > > > thp_get_unmapped_area() does not take care of the legacy stuff...
> > > > > > >
> > > > > > > This change also affects the entropy of allocations. With this patch
> > > > > > > Android test [1] started failing and it requires only 8 bits of
> > > > > > > entropy. The feedback from our security team:
> > > > > > >
> > > > > > > 8 bits of entropy is already embarrassingly low, but was necessary for
> > > > > > > 32 bit ARM targets with low RAM at the time. It's definitely not
> > > > > > > acceptable for 64 bit targets.
> > > > > >
> > > > > > Thanks for the report. Is it 32 bit only or 64 bit is also impacted?
> > > > > > If I understand the code correctly, it expects the address allocated
> > > > > > by malloc() is kind of randomized, right?
> > > > >
> > > > > Yes, correct, the test expects a certain level of address randomization.
> > > > > The test failure was reported while running kernel_virt_x86_64 target
> > > > > (Android emulator on x86), so it does impact 64bit targets.
> > > >
> > > > IIUC this breaks the "expectation" for randomized addresses returned
> > > > by malloc(), but it doesn't break any real Android application, right?
> > > > So this is a security concern instead of a real regression.
> > >
> > > How is making a system move vulnerabile not a real regression?
> > >
> > > >
> > > > I think we can make this opt-in with a knob. Anyone who outweighs
> > > > security could opt this feature out. However I'm wondering whether
> > > > Android should implement a general address randomization mechanism
> > > > instead of depending on "luck" if you do care about it.
> > >
> > > This is not depending on luck. This is checking for possible
> > > vulnerabilities in the system.
> > > I admit I'm not a security expert, so I'm looping in Jeff and Kees to chime in.
> >
> > Hi!
> >
> > Just to chime in, but reduction in ASLR entropy is absolutely a
> > regression. This is userspace visible (via /proc/sys/kernel/randomize_va_space,
> > /proc/sys/vm/mmap_rnd*_bits) with expectations that it work as
> > advertised. So, while 32-bit might be already ASLR-weak, we don't want
> > to make things super bad nor break ASLR in compat mode under 64-bit
> > systems.
> >
> > Having an opt-in sounds reasonable, but we need to wire it to the ASLR
> > sysctls in some way so nothing lying about the ASLR entropy.
>
> Thanks for the explanation. IIUC the randomiza_va_space and
> mmap_rnd_bits randomize the mmap_base and start_brk for each exec()
> call. So the heap allocation is randomized. But it seems the formula
> doesn't take into account huge page. ARM64 adjusts the mmap_rnd_bits
> based on page size.
>
> I did a simple test, which conceptually does:
> 1. call mmap to allocate 8M heap
> 2. print out the allocated address
> 3. run the program 1000 times (launch/exit/re-launch)
> 4. check how many unique addresses
>
> With the default config on my arm64 VM (mmap_rnd_bits is 18), I saw
> 134 unique addresses. Without the patch, I saw 945 unique addresses.
> So I think the test could replicate what your test does.
>
> When I increased the mmap_rnd_bits to 24, I saw 988 unique addresses
> with the patch. x86_64 should have 28 bits by default, it should
> randomize the address quite well. I don't know why you still saw this,
> or you have a different setting for mmap_rnd_bits?

I checked the configuration on our test harness where the test failed:

/proc/sys/vm/mmap_rnd_bits = 32
/proc/sys/vm/mmap_rnd_compat_bits = 16

The failure logs are:

10-20 14:37:52.123 7029 7029 V AslrMallocTest: 7 bits of entropy for
allocation size 8388608 (minimum 8)
10-20 14:37:52.123 7029 7029 E AslrMallocTest: insufficient entropy
for malloc(8388608)

which come from here:
https://cs.android.com/android/platform/superproject/main/+/main:cts/tests/aslr/src/AslrMallocTest.cpp;l=127
So, the allocation size for which this test failed was 2^23.


> I'm wondering whether we should take into account huge page alignment
> for mmap_rnd_bits. And I think this is a huge page common problem, we
> have file mapping huge page aligned as well.
>
> 32 bit is easy, I think I can just make thp_get_unmapped_area() a
> no-op on 32 bit system.
>
> >
> > -Kees
> >
> > --
> > Kees Cook

2024-01-18 00:10:07

by Suren Baghdasaryan

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Wed, Jan 17, 2024 at 4:07 PM Yang Shi <[email protected]> wrote:
>
> On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> >
> > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > Hi,
> > >
> > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > >> Align larger anonymous memory mappings on THP boundaries by
> > >> going through thp_get_unmapped_area if THPs are enabled for
> > >> the current process.
> > >>
> > >> With this patch, larger anonymous mappings are now THP aligned.
> > >> When a malloc library allocates a 2MB or larger arena, that
> > >> arena can now be mapped with THPs right from the start, which
> > >> can result in better TLB hit rates and execution time.
> > >
> > > This appears to break 32bit processes on x86_64 (at least). In
> > > particular, 32bit kernel or firefox builds in our build system.
> > >
> > > Reverting this on top of 6.7 makes it work again.
> > >
> > > Downstream report:
> > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > >
> > > So running:
> > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > >
> > > crashes or errors out with some random errors:
> > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > Error emitting field
> > >
> > > strace shows mmap() fails with ENOMEM right before the errors:
> > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > ...
> > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > memory)
> > >
> > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > enough. For reference, one is available at:
> > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > >
> > > Any ideas?
> >
> > This works around the problem, of course (but is a band-aid, not a fix):
> >
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > addr, unsigned long len,
> > */
> > pgoff = 0;
> > get_area = shmem_get_unmapped_area;
> > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > !in_32bit_syscall()) {
> > /* Ensures that larger anonymous mappings are THP
> > aligned. */
> > get_area = thp_get_unmapped_area;
> > }
> >
> >
> > thp_get_unmapped_area() does not take care of the legacy stuff...
>
> Could you please help test the below patch? It is compiled, but I
> don't have 32 bit userspace or machine to test it.

Hmm. I think you misunderstood me. This is happening on x86_64 emulated system.

>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 94ef5c02b459..a4d0839e5f31 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -811,6 +811,9 @@ static unsigned long
> __thp_get_unmapped_area(struct file *filp,
> loff_t off_align = round_up(off, size);
> unsigned long len_pad, ret;
>
> + if (IS_ENABLED(CONFIG_32BIT) || in_compat_syscall())
> + return 0;
> +
> if (off_end <= off_align || (off_end - off_align) < size)
> return 0;
>
>
> >
> > regards,
> > --
> > js
> > suse labs
> >
>

2024-01-18 00:14:06

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Wed, Jan 17, 2024 at 4:02 PM Suren Baghdasaryan <[email protected]> wrote:
>
> On Wed, Jan 17, 2024 at 3:32 PM Yang Shi <[email protected]> wrote:
> >
> > On Wed, Jan 17, 2024 at 9:40 AM Kees Cook <[email protected]> wrote:
> > >
> > > On Tue, Jan 16, 2024 at 02:30:36PM -0800, Suren Baghdasaryan wrote:
> > > > On Tue, Jan 16, 2024 at 2:25 PM Yang Shi <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jan 16, 2024 at 1:58 PM Suren Baghdasaryan <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, Jan 16, 2024 at 12:56 PM Yang Shi <[email protected]> wrote:
> > > > > > >
> > > > > > > On Tue, Jan 16, 2024 at 11:16 AM Suren Baghdasaryan <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > > > > > > > > >> Align larger anonymous memory mappings on THP boundaries by
> > > > > > > > > >> going through thp_get_unmapped_area if THPs are enabled for
> > > > > > > > > >> the current process.
> > > > > > > > > >>
> > > > > > > > > >> With this patch, larger anonymous mappings are now THP aligned.
> > > > > > > > > >> When a malloc library allocates a 2MB or larger arena, that
> > > > > > > > > >> arena can now be mapped with THPs right from the start, which
> > > > > > > > > >> can result in better TLB hit rates and execution time.
> > > > > > > > > >
> > > > > > > > > > This appears to break 32bit processes on x86_64 (at least). In
> > > > > > > > > > particular, 32bit kernel or firefox builds in our build system.
> > > > > > > > > >
> > > > > > > > > > Reverting this on top of 6.7 makes it work again.
> > > > > > > > > >
> > > > > > > > > > Downstream report:
> > > > > > > > > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > > > > > > > > >
> > > > > > > > > > So running:
> > > > > > > > > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > > > > > > > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > > > > > > > > >
> > > > > > > > > > crashes or errors out with some random errors:
> > > > > > > > > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > > > > > > > > Error emitting field
> > > > > > > > > >
> > > > > > > > > > strace shows mmap() fails with ENOMEM right before the errors:
> > > > > > > > > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > > > > > > > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > > > > > > > > ...
> > > > > > > > > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > > > > > > > > memory)
> > > > > > > > > >
> > > > > > > > > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > > > > > > > > enough. For reference, one is available at:
> > > > > > > > > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > > > > > > > > >
> > > > > > > > > > Any ideas?
> > > > > > > > >
> > > > > > > > > This works around the problem, of course (but is a band-aid, not a fix):
> > > > > > > > >
> > > > > > > > > --- a/mm/mmap.c
> > > > > > > > > +++ b/mm/mmap.c
> > > > > > > > > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > > > > > > > > addr, unsigned long len,
> > > > > > > > > */
> > > > > > > > > pgoff = 0;
> > > > > > > > > get_area = shmem_get_unmapped_area;
> > > > > > > > > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > > > > > > > > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > > > > > > > > !in_32bit_syscall()) {
> > > > > > > > > /* Ensures that larger anonymous mappings are THP
> > > > > > > > > aligned. */
> > > > > > > > > get_area = thp_get_unmapped_area;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > thp_get_unmapped_area() does not take care of the legacy stuff...
> > > > > > > >
> > > > > > > > This change also affects the entropy of allocations. With this patch
> > > > > > > > Android test [1] started failing and it requires only 8 bits of
> > > > > > > > entropy. The feedback from our security team:
> > > > > > > >
> > > > > > > > 8 bits of entropy is already embarrassingly low, but was necessary for
> > > > > > > > 32 bit ARM targets with low RAM at the time. It's definitely not
> > > > > > > > acceptable for 64 bit targets.
> > > > > > >
> > > > > > > Thanks for the report. Is it 32 bit only or 64 bit is also impacted?
> > > > > > > If I understand the code correctly, it expects the address allocated
> > > > > > > by malloc() is kind of randomized, right?
> > > > > >
> > > > > > Yes, correct, the test expects a certain level of address randomization.
> > > > > > The test failure was reported while running kernel_virt_x86_64 target
> > > > > > (Android emulator on x86), so it does impact 64bit targets.
> > > > >
> > > > > IIUC this breaks the "expectation" for randomized addresses returned
> > > > > by malloc(), but it doesn't break any real Android application, right?
> > > > > So this is a security concern instead of a real regression.
> > > >
> > > > How is making a system move vulnerabile not a real regression?
> > > >
> > > > >
> > > > > I think we can make this opt-in with a knob. Anyone who outweighs
> > > > > security could opt this feature out. However I'm wondering whether
> > > > > Android should implement a general address randomization mechanism
> > > > > instead of depending on "luck" if you do care about it.
> > > >
> > > > This is not depending on luck. This is checking for possible
> > > > vulnerabilities in the system.
> > > > I admit I'm not a security expert, so I'm looping in Jeff and Kees to chime in.
> > >
> > > Hi!
> > >
> > > Just to chime in, but reduction in ASLR entropy is absolutely a
> > > regression. This is userspace visible (via /proc/sys/kernel/randomize_va_space,
> > > /proc/sys/vm/mmap_rnd*_bits) with expectations that it work as
> > > advertised. So, while 32-bit might be already ASLR-weak, we don't want
> > > to make things super bad nor break ASLR in compat mode under 64-bit
> > > systems.
> > >
> > > Having an opt-in sounds reasonable, but we need to wire it to the ASLR
> > > sysctls in some way so nothing lying about the ASLR entropy.
> >
> > Thanks for the explanation. IIUC the randomiza_va_space and
> > mmap_rnd_bits randomize the mmap_base and start_brk for each exec()
> > call. So the heap allocation is randomized. But it seems the formula
> > doesn't take into account huge page. ARM64 adjusts the mmap_rnd_bits
> > based on page size.
> >
> > I did a simple test, which conceptually does:
> > 1. call mmap to allocate 8M heap
> > 2. print out the allocated address
> > 3. run the program 1000 times (launch/exit/re-launch)
> > 4. check how many unique addresses
> >
> > With the default config on my arm64 VM (mmap_rnd_bits is 18), I saw
> > 134 unique addresses. Without the patch, I saw 945 unique addresses.
> > So I think the test could replicate what your test does.
> >
> > When I increased the mmap_rnd_bits to 24, I saw 988 unique addresses
> > with the patch. x86_64 should have 28 bits by default, it should
> > randomize the address quite well. I don't know why you still saw this,
> > or you have a different setting for mmap_rnd_bits?
>
> I checked the configuration on our test harness where the test failed:

Thanks, Suren.

>
> /proc/sys/vm/mmap_rnd_bits = 32

It is surprising 32 bits still fail. 24 bits on arm64 works for me. Or
the compat bits is used?

> /proc/sys/vm/mmap_rnd_compat_bits = 16
>
> The failure logs are:
>
> 10-20 14:37:52.123 7029 7029 V AslrMallocTest: 7 bits of entropy for
> allocation size 8388608 (minimum 8)
> 10-20 14:37:52.123 7029 7029 E AslrMallocTest: insufficient entropy
> for malloc(8388608)
>
> which come from here:
> https://cs.android.com/android/platform/superproject/main/+/main:cts/tests/aslr/src/AslrMallocTest.cpp;l=127
> So, the allocation size for which this test failed was 2^23.

The patch just tries to align >= 2M allocations. It looks like your
test allocates 256 bytes, 64K and 8M. So just 8M is impacted.


>
>
> > I'm wondering whether we should take into account huge page alignment
> > for mmap_rnd_bits. And I think this is a huge page common problem, we
> > have file mapping huge page aligned as well.
> >
> > 32 bit is easy, I think I can just make thp_get_unmapped_area() a
> > no-op on 32 bit system.
> >
> > >
> > > -Kees
> > >
> > > --
> > > Kees Cook

2024-01-18 00:14:28

by Suren Baghdasaryan

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Wed, Jan 17, 2024 at 4:09 PM Suren Baghdasaryan <[email protected]> wrote:
>
> On Wed, Jan 17, 2024 at 4:07 PM Yang Shi <[email protected]> wrote:
> >
> > On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> > >
> > > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > > Hi,
> > > >
> > > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > > >> Align larger anonymous memory mappings on THP boundaries by
> > > >> going through thp_get_unmapped_area if THPs are enabled for
> > > >> the current process.
> > > >>
> > > >> With this patch, larger anonymous mappings are now THP aligned.
> > > >> When a malloc library allocates a 2MB or larger arena, that
> > > >> arena can now be mapped with THPs right from the start, which
> > > >> can result in better TLB hit rates and execution time.
> > > >
> > > > This appears to break 32bit processes on x86_64 (at least). In
> > > > particular, 32bit kernel or firefox builds in our build system.
> > > >
> > > > Reverting this on top of 6.7 makes it work again.
> > > >
> > > > Downstream report:
> > > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > > >
> > > > So running:
> > > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > > >
> > > > crashes or errors out with some random errors:
> > > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > > Error emitting field
> > > >
> > > > strace shows mmap() fails with ENOMEM right before the errors:
> > > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > > ...
> > > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > > memory)
> > > >
> > > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > > enough. For reference, one is available at:
> > > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > > >
> > > > Any ideas?
> > >
> > > This works around the problem, of course (but is a band-aid, not a fix):
> > >
> > > --- a/mm/mmap.c
> > > +++ b/mm/mmap.c
> > > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > > addr, unsigned long len,
> > > */
> > > pgoff = 0;
> > > get_area = shmem_get_unmapped_area;
> > > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > > !in_32bit_syscall()) {
> > > /* Ensures that larger anonymous mappings are THP
> > > aligned. */
> > > get_area = thp_get_unmapped_area;
> > > }
> > >
> > >
> > > thp_get_unmapped_area() does not take care of the legacy stuff...
> >
> > Could you please help test the below patch? It is compiled, but I
> > don't have 32 bit userspace or machine to test it.
>
> Hmm. I think you misunderstood me. This is happening on x86_64 emulated system.

I mean it's x86_64 Android emulator, so kernel is 64bit.

>
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 94ef5c02b459..a4d0839e5f31 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -811,6 +811,9 @@ static unsigned long
> > __thp_get_unmapped_area(struct file *filp,
> > loff_t off_align = round_up(off, size);
> > unsigned long len_pad, ret;
> >
> > + if (IS_ENABLED(CONFIG_32BIT) || in_compat_syscall())
> > + return 0;
> > +
> > if (off_end <= off_align || (off_end - off_align) < size)
> > return 0;
> >
> >
> > >
> > > regards,
> > > --
> > > js
> > > suse labs
> > >
> >

2024-01-18 01:34:44

by Suren Baghdasaryan

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Wed, Jan 17, 2024 at 4:29 PM Suren Baghdasaryan <[email protected]> wrote:
>
> On Wed, Jan 17, 2024 at 4:13 PM Yang Shi <[email protected]> wrote:
> >
> > On Wed, Jan 17, 2024 at 4:02 PM Suren Baghdasaryan <[email protected]> wrote:
> > >
> > > On Wed, Jan 17, 2024 at 3:32 PM Yang Shi <[email protected]> wrote:
> > > >
> > > > On Wed, Jan 17, 2024 at 9:40 AM Kees Cook <[email protected]> wrote:
> > > > >
> > > > > On Tue, Jan 16, 2024 at 02:30:36PM -0800, Suren Baghdasaryan wrote:
> > > > > > On Tue, Jan 16, 2024 at 2:25 PM Yang Shi <[email protected]> wrote:
> > > > > > >
> > > > > > > On Tue, Jan 16, 2024 at 1:58 PM Suren Baghdasaryan <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Tue, Jan 16, 2024 at 12:56 PM Yang Shi <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Jan 16, 2024 at 11:16 AM Suren Baghdasaryan <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, Jan 16, 2024 at 4:09 AM Jiri Slaby <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On 16. 01. 24, 12:53, Jiri Slaby wrote:
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > On 09. 08. 22, 20:24, Rik van Riel wrote:
> > > > > > > > > > > >> Align larger anonymous memory mappings on THP boundaries by
> > > > > > > > > > > >> going through thp_get_unmapped_area if THPs are enabled for
> > > > > > > > > > > >> the current process.
> > > > > > > > > > > >>
> > > > > > > > > > > >> With this patch, larger anonymous mappings are now THP aligned.
> > > > > > > > > > > >> When a malloc library allocates a 2MB or larger arena, that
> > > > > > > > > > > >> arena can now be mapped with THPs right from the start, which
> > > > > > > > > > > >> can result in better TLB hit rates and execution time.
> > > > > > > > > > > >
> > > > > > > > > > > > This appears to break 32bit processes on x86_64 (at least). In
> > > > > > > > > > > > particular, 32bit kernel or firefox builds in our build system.
> > > > > > > > > > > >
> > > > > > > > > > > > Reverting this on top of 6.7 makes it work again.
> > > > > > > > > > > >
> > > > > > > > > > > > Downstream report:
> > > > > > > > > > > > https://bugzilla.suse.com/show_bug.cgi?id=1218841
> > > > > > > > > > > >
> > > > > > > > > > > > So running:
> > > > > > > > > > > > pahole -J --btf_gen_floats -j --lang_exclude=rust
> > > > > > > > > > > > --skip_encoding_btf_inconsistent_proto --btf_gen_optimized .tmp_vmlinux.btf
> > > > > > > > > > > >
> > > > > > > > > > > > crashes or errors out with some random errors:
> > > > > > > > > > > > [182671] STRUCT idr's field 'idr_next' offset=128 bit_size=0 type=181346
> > > > > > > > > > > > Error emitting field
> > > > > > > > > > > >
> > > > > > > > > > > > strace shows mmap() fails with ENOMEM right before the errors:
> > > > > > > > > > > > 1223 mmap2(NULL, 5783552, PROT_READ|PROT_WRITE,
> > > > > > > > > > > > MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
> > > > > > > > > > > > ...
> > > > > > > > > > > > 1223 <... mmap2 resumed>) = -1 ENOMEM (Cannot allocate
> > > > > > > > > > > > memory)
> > > > > > > > > > > >
> > > > > > > > > > > > Note the .tmp_vmlinux.btf above can be arbitrary, but likely large
> > > > > > > > > > > > enough. For reference, one is available at:
> > > > > > > > > > > > https://decibel.fi.muni.cz/~xslaby/n/btf
> > > > > > > > > > > >
> > > > > > > > > > > > Any ideas?
> > > > > > > > > > >
> > > > > > > > > > > This works around the problem, of course (but is a band-aid, not a fix):
> > > > > > > > > > >
> > > > > > > > > > > --- a/mm/mmap.c
> > > > > > > > > > > +++ b/mm/mmap.c
> > > > > > > > > > > @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > > > > > > > > > > addr, unsigned long len,
> > > > > > > > > > > */
> > > > > > > > > > > pgoff = 0;
> > > > > > > > > > > get_area = shmem_get_unmapped_area;
> > > > > > > > > > > - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > > > > > > > > > > + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > > > > > > > > > > !in_32bit_syscall()) {
> > > > > > > > > > > /* Ensures that larger anonymous mappings are THP
> > > > > > > > > > > aligned. */
> > > > > > > > > > > get_area = thp_get_unmapped_area;
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > thp_get_unmapped_area() does not take care of the legacy stuff...
> > > > > > > > > >
> > > > > > > > > > This change also affects the entropy of allocations. With this patch
> > > > > > > > > > Android test [1] started failing and it requires only 8 bits of
> > > > > > > > > > entropy. The feedback from our security team:
> > > > > > > > > >
> > > > > > > > > > 8 bits of entropy is already embarrassingly low, but was necessary for
> > > > > > > > > > 32 bit ARM targets with low RAM at the time. It's definitely not
> > > > > > > > > > acceptable for 64 bit targets.
> > > > > > > > >
> > > > > > > > > Thanks for the report. Is it 32 bit only or 64 bit is also impacted?
> > > > > > > > > If I understand the code correctly, it expects the address allocated
> > > > > > > > > by malloc() is kind of randomized, right?
> > > > > > > >
> > > > > > > > Yes, correct, the test expects a certain level of address randomization.
> > > > > > > > The test failure was reported while running kernel_virt_x86_64 target
> > > > > > > > (Android emulator on x86), so it does impact 64bit targets.
> > > > > > >
> > > > > > > IIUC this breaks the "expectation" for randomized addresses returned
> > > > > > > by malloc(), but it doesn't break any real Android application, right?
> > > > > > > So this is a security concern instead of a real regression.
> > > > > >
> > > > > > How is making a system move vulnerabile not a real regression?
> > > > > >
> > > > > > >
> > > > > > > I think we can make this opt-in with a knob. Anyone who outweighs
> > > > > > > security could opt this feature out. However I'm wondering whether
> > > > > > > Android should implement a general address randomization mechanism
> > > > > > > instead of depending on "luck" if you do care about it.
> > > > > >
> > > > > > This is not depending on luck. This is checking for possible
> > > > > > vulnerabilities in the system.
> > > > > > I admit I'm not a security expert, so I'm looping in Jeff and Kees to chime in.
> > > > >
> > > > > Hi!
> > > > >
> > > > > Just to chime in, but reduction in ASLR entropy is absolutely a
> > > > > regression. This is userspace visible (via /proc/sys/kernel/randomize_va_space,
> > > > > /proc/sys/vm/mmap_rnd*_bits) with expectations that it work as
> > > > > advertised. So, while 32-bit might be already ASLR-weak, we don't want
> > > > > to make things super bad nor break ASLR in compat mode under 64-bit
> > > > > systems.
> > > > >
> > > > > Having an opt-in sounds reasonable, but we need to wire it to the ASLR
> > > > > sysctls in some way so nothing lying about the ASLR entropy.
> > > >
> > > > Thanks for the explanation. IIUC the randomiza_va_space and
> > > > mmap_rnd_bits randomize the mmap_base and start_brk for each exec()
> > > > call. So the heap allocation is randomized. But it seems the formula
> > > > doesn't take into account huge page. ARM64 adjusts the mmap_rnd_bits
> > > > based on page size.
> > > >
> > > > I did a simple test, which conceptually does:
> > > > 1. call mmap to allocate 8M heap
> > > > 2. print out the allocated address
> > > > 3. run the program 1000 times (launch/exit/re-launch)
> > > > 4. check how many unique addresses
> > > >
> > > > With the default config on my arm64 VM (mmap_rnd_bits is 18), I saw
> > > > 134 unique addresses. Without the patch, I saw 945 unique addresses.
> > > > So I think the test could replicate what your test does.
> > > >
> > > > When I increased the mmap_rnd_bits to 24, I saw 988 unique addresses
> > > > with the patch. x86_64 should have 28 bits by default, it should
> > > > randomize the address quite well. I don't know why you still saw this,
> > > > or you have a different setting for mmap_rnd_bits?
> > >
> > > I checked the configuration on our test harness where the test failed:
> >
> > Thanks, Suren.
> >
> > >
> > > /proc/sys/vm/mmap_rnd_bits = 32
> >
> > It is surprising 32 bits still fail. 24 bits on arm64 works for me. Or
> > the compat bits is used?
>
> Hmm. Let me verify to exclude that possibility.

Aha! You are correct, the test is using compat syscalls and your
suggestion at https://lore.kernel.org/all/CAHbLzkoL6sCDciHqVMJga288853CHdOTa5thOPQ9SHKSqjGGPQ@mail.gmail.com/
seems to fix it. I started a complete set of presubmit tests at
https://android-review.googlesource.com/c/kernel/common/+/2916065 and
will report the results tomorrow morning but I expect them to pass
now.
Thanks,
Suren.

>
> >
> > > /proc/sys/vm/mmap_rnd_compat_bits = 16
> > >
> > > The failure logs are:
> > >
> > > 10-20 14:37:52.123 7029 7029 V AslrMallocTest: 7 bits of entropy for
> > > allocation size 8388608 (minimum 8)
> > > 10-20 14:37:52.123 7029 7029 E AslrMallocTest: insufficient entropy
> > > for malloc(8388608)
> > >
> > > which come from here:
> > > https://cs.android.com/android/platform/superproject/main/+/main:cts/tests/aslr/src/AslrMallocTest.cpp;l=127
> > > So, the allocation size for which this test failed was 2^23.
> >
> > The patch just tries to align >= 2M allocations. It looks like your
> > test allocates 256 bytes, 64K and 8M. So just 8M is impacted.
>
> Correct.
>
> >
> >
> > >
> > >
> > > > I'm wondering whether we should take into account huge page alignment
> > > > for mmap_rnd_bits. And I think this is a huge page common problem, we
> > > > have file mapping huge page aligned as well.
> > > >
> > > > 32 bit is easy, I think I can just make thp_get_unmapped_area() a
> > > > no-op on 32 bit system.
> > > >
> > > > >
> > > > > -Kees
> > > > >
> > > > > --
> > > > > Kees Cook

2024-01-18 07:06:24

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On 18. 01. 24, 1:07, Yang Shi wrote:
>> This works around the problem, of course (but is a band-aid, not a fix):
>>
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
>> addr, unsigned long len,
>> */
>> pgoff = 0;
>> get_area = shmem_get_unmapped_area;
>> - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
>> + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
>> !in_32bit_syscall()) {
>> /* Ensures that larger anonymous mappings are THP
>> aligned. */
>> get_area = thp_get_unmapped_area;
>> }
>>
>>
>> thp_get_unmapped_area() does not take care of the legacy stuff...
>
> Could you please help test the below patch? It is compiled, but I
> don't have 32 bit userspace or machine to test it.

Yeah, for x86_64, it's semantically the same as the above, so this works
too:

Tested-by: Jiri Slaby <[email protected]>

> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -811,6 +811,9 @@ static unsigned long
> __thp_get_unmapped_area(struct file *filp,
> loff_t off_align = round_up(off, size);
> unsigned long len_pad, ret;
>
> + if (IS_ENABLED(CONFIG_32BIT) || in_compat_syscall())
> + return 0;
> +
> if (off_end <= off_align || (off_end - off_align) < size)
> return 0;

thanks,
--
js
suse labs


2024-01-18 17:49:04

by Suren Baghdasaryan

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Wed, Jan 17, 2024 at 11:05 PM Jiri Slaby <[email protected]> wrote:
>
> On 18. 01. 24, 1:07, Yang Shi wrote:
> >> This works around the problem, of course (but is a band-aid, not a fix):
> >>
> >> --- a/mm/mmap.c
> >> +++ b/mm/mmap.c
> >> @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> >> addr, unsigned long len,
> >> */
> >> pgoff = 0;
> >> get_area = shmem_get_unmapped_area;
> >> - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> >> + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> >> !in_32bit_syscall()) {
> >> /* Ensures that larger anonymous mappings are THP
> >> aligned. */
> >> get_area = thp_get_unmapped_area;
> >> }
> >>
> >>
> >> thp_get_unmapped_area() does not take care of the legacy stuff...
> >
> > Could you please help test the below patch? It is compiled, but I
> > don't have 32 bit userspace or machine to test it.
>
> Yeah, for x86_64, it's semantically the same as the above, so this works
> too:
>
> Tested-by: Jiri Slaby <[email protected]>

With the addition of #include <linux/compat.h> it builds and passes
our tests as well. Thanks!

Tested-by: Suren Baghdasaryan <[email protected]>

>
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -811,6 +811,9 @@ static unsigned long
> > __thp_get_unmapped_area(struct file *filp,
> > loff_t off_align = round_up(off, size);
> > unsigned long len_pad, ret;
> >
> > + if (IS_ENABLED(CONFIG_32BIT) || in_compat_syscall())
> > + return 0;
> > +
> > if (off_end <= off_align || (off_end - off_align) < size)
> > return 0;
>
> thanks,
> --
> js
> suse labs
>
>

2024-01-18 18:42:39

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Thu, Jan 18, 2024 at 9:48 AM Suren Baghdasaryan <[email protected]> wrote:
>
> On Wed, Jan 17, 2024 at 11:05 PM Jiri Slaby <[email protected]> wrote:
> >
> > On 18. 01. 24, 1:07, Yang Shi wrote:
> > >> This works around the problem, of course (but is a band-aid, not a fix):
> > >>
> > >> --- a/mm/mmap.c
> > >> +++ b/mm/mmap.c
> > >> @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> > >> addr, unsigned long len,
> > >> */
> > >> pgoff = 0;
> > >> get_area = shmem_get_unmapped_area;
> > >> - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> > >> + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> > >> !in_32bit_syscall()) {
> > >> /* Ensures that larger anonymous mappings are THP
> > >> aligned. */
> > >> get_area = thp_get_unmapped_area;
> > >> }
> > >>
> > >>
> > >> thp_get_unmapped_area() does not take care of the legacy stuff...
> > >
> > > Could you please help test the below patch? It is compiled, but I
> > > don't have 32 bit userspace or machine to test it.
> >
> > Yeah, for x86_64, it's semantically the same as the above, so this works
> > too:
> >
> > Tested-by: Jiri Slaby <[email protected]>
>
> With the addition of #include <linux/compat.h> it builds and passes
> our tests as well. Thanks!
>
> Tested-by: Suren Baghdasaryan <[email protected]>

Thanks for checking the test case and testing the patch. I will fix it
and post the formal patch.

>
> >
> > > --- a/mm/huge_memory.c
> > > +++ b/mm/huge_memory.c
> > > @@ -811,6 +811,9 @@ static unsigned long
> > > __thp_get_unmapped_area(struct file *filp,
> > > loff_t off_align = round_up(off, size);
> > > unsigned long len_pad, ret;
> > >
> > > + if (IS_ENABLED(CONFIG_32BIT) || in_compat_syscall())
> > > + return 0;
> > > +
> > > if (off_end <= off_align || (off_end - off_align) < size)
> > > return 0;
> >
> > thanks,
> > --
> > js
> > suse labs
> >
> >

2024-01-18 18:43:38

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On Wed, Jan 17, 2024 at 11:04 PM Jiri Slaby <[email protected]> wrote:
>
> On 18. 01. 24, 1:07, Yang Shi wrote:
> >> This works around the problem, of course (but is a band-aid, not a fix):
> >>
> >> --- a/mm/mmap.c
> >> +++ b/mm/mmap.c
> >> @@ -1829,7 +1829,7 @@ get_unmapped_area(struct file *file, unsigned long
> >> addr, unsigned long len,
> >> */
> >> pgoff = 0;
> >> get_area = shmem_get_unmapped_area;
> >> - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
> >> + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> >> !in_32bit_syscall()) {
> >> /* Ensures that larger anonymous mappings are THP
> >> aligned. */
> >> get_area = thp_get_unmapped_area;
> >> }
> >>
> >>
> >> thp_get_unmapped_area() does not take care of the legacy stuff...
> >
> > Could you please help test the below patch? It is compiled, but I
> > don't have 32 bit userspace or machine to test it.
>
> Yeah, for x86_64, it's semantically the same as the above, so this works
> too:
>
> Tested-by: Jiri Slaby <[email protected]>

Thanks!

>
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -811,6 +811,9 @@ static unsigned long
> > __thp_get_unmapped_area(struct file *filp,
> > loff_t off_align = round_up(off, size);
> > unsigned long len_pad, ret;
> >
> > + if (IS_ENABLED(CONFIG_32BIT) || in_compat_syscall())
> > + return 0;
> > +
> > if (off_end <= off_align || (off_end - off_align) < size)
> > return 0;
>
> thanks,
> --
> js
> suse labs
>

2024-01-20 13:43:39

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On 16.01.24 12:53, Jiri Slaby wrote:

> On 09. 08. 22, 20:24, Rik van Riel wrote:
>> Align larger anonymous memory mappings on THP boundaries by
>> going through thp_get_unmapped_area if THPs are enabled for
>> the current process.
>>
>> With this patch, larger anonymous mappings are now THP aligned.
>> When a malloc library allocates a 2MB or larger arena, that
>> arena can now be mapped with THPs right from the start, which
>> can result in better TLB hit rates and execution time.
>
> This appears to break 32bit processes on x86_64 (at least). In
> particular, 32bit kernel or firefox builds in our build system.
>
> Reverting this on top of 6.7 makes it work again.
>
> Downstream report:
>  https://bugzilla.suse.com/show_bug.cgi?id=1218841
> [...]
#regzbot ^introduced efa7df3e3bb5
#regzbot title mm: huge_memory: 32 bit systems or compat
userspace broke
#regzbot link: https://bugzilla.suse.com/show_bug.cgi?id=1218841
#regzbot fix: mm: huge_memory: don't force huge page alignment on 32 bit
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

2024-01-20 15:47:59

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: [PATCH v2] mm: align larger anonymous mappings on THP boundaries

On 16.01.24 12:53, Jiri Slaby wrote:

> On 09. 08. 22, 20:24, Rik van Riel wrote:
>> Align larger anonymous memory mappings on THP boundaries by
>> going through thp_get_unmapped_area if THPs are enabled for
>> the current process.
>>
>> With this patch, larger anonymous mappings are now THP aligned.
>> When a malloc library allocates a 2MB or larger arena, that
>> arena can now be mapped with THPs right from the start, which
>> can result in better TLB hit rates and execution time.
>
> This appears to break 32bit processes on x86_64 (at least). In
> particular, 32bit kernel or firefox builds in our build system.
>
> Reverting this on top of 6.7 makes it work again.
>
> Downstream report:
>  https://bugzilla.suse.com/show_bug.cgi?id=1218841
> [...]

Trying this again, sorry for the noise:

#regzbot ^introduced efa7df3e3bb5
#regzbot title mm: huge_memory: 32 bit systems or compat userspace broke
#regzbot link: https://bugzilla.suse.com/show_bug.cgi?id=1218841
#regzbot fix: mm: huge_memory: don't force huge page alignment on 32 bit
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.