2023-06-27 13:37:37

by Julian Pidancet

[permalink] [raw]
Subject: [PATCH] mm/slub: disable slab merging in the default configuration

Make CONFIG_SLAB_MERGE_DEFAULT default to n unless CONFIG_SLUB_TINY is
enabled. Benefits of slab merging is limited on systems that are not
memory constrained: the overhead is negligible and evidence of its
effect on cache hotness is hard to come by.

On the other hand, distinguishing allocations into different slabs will
make attacks that rely on "heap spraying" more difficult to carry out
with success.

Take sides with security in the default kernel configuration over
questionnable performance benefits/memory efficiency.

Signed-off-by: Julian Pidancet <[email protected]>
---
In an attempt to assess the performance impact of disabling slab
merging, a timed linux kernel compilation test has been conducted first
using slab_merge, then using slab_nomerge. Both tests started in an
identical state. Commodity hardware was used: a laptop with an AMD Ryzen
5 3500U CPU, and 16GiB of RAM. The kernel source files were placed on
an XFS partition because of the extensive use of slab caches in XFS.

The results are as follows:

| slab_merge | slab_nomerge |
------+------------------+------------------|
Time | 489.074 ± 10.334 | 489.975 ± 10.350 |
Min | 459.688 | 460.554 |
Max | 493.126 | 494.282 |

The benchmark favors the configuration where merging is disabled, but the
difference is only ~0.18%, well under statistical significance.

.../admin-guide/kernel-parameters.txt | 29 ++++++++++---------
Documentation/mm/slub.rst | 5 ++--
mm/Kconfig | 6 ++--
3 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9e5bab29685f..5fbf6ed3c62e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -5652,21 +5652,22 @@

slram= [HW,MTD]

- slab_merge [MM]
- Enable merging of slabs with similar size when the
- kernel is built without CONFIG_SLAB_MERGE_DEFAULT.
-
slab_nomerge [MM]
- Disable merging of slabs with similar size. May be
- necessary if there is some reason to distinguish
- allocs to different slabs, especially in hardened
- environments where the risk of heap overflows and
- layout control by attackers can usually be
- frustrated by disabling merging. This will reduce
- most of the exposure of a heap attack to a single
- cache (risks via metadata attacks are mostly
- unchanged). Debug options disable merging on their
- own.
+ Disable merging of slabs with similar size when
+ the kernel is built with CONFIG_SLAB_MERGE_DEFAULT.
+ Allocations of the same size made in distinct
+ caches will be placed in separate slabs. In
+ hardened environment, the risk of heap overflows
+ and layout control by attackers can usually be
+ frustrated by disabling merging.
+
+ slab_merge [MM]
+ Enable merging of slabs with similar size. May be
+ necessary to reduce overhead or increase cache
+ hotness of objects, at the cost of increased
+ exposure in case of a heap attack to a single
+ cache. (risks via metadata attacks are mostly
+ unchanged).
For more information see Documentation/mm/slub.rst.

slab_max_order= [MM, SLAB]
diff --git a/Documentation/mm/slub.rst b/Documentation/mm/slub.rst
index be75971532f5..e2549f4a98dd 100644
--- a/Documentation/mm/slub.rst
+++ b/Documentation/mm/slub.rst
@@ -122,8 +122,9 @@ used on the wrong slab.
Slab merging
============

-If no debug options are specified then SLUB may merge similar slabs together
-in order to reduce overhead and increase cache hotness of objects.
+If the kernel is built with ``CONFIG_SLAB_MEGE_DEFAULT`` or if ``slab_merge``
+is specified on the kernel command line, then SLUB may merge similar slabs
+together in order to reduce overhead and increase cache hotness of objects.
``slabinfo -a`` displays which slabs were merged together.

Slab validation
diff --git a/mm/Kconfig b/mm/Kconfig
index 7672a22647b4..05b0304302d4 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -255,7 +255,7 @@ config SLUB_TINY

config SLAB_MERGE_DEFAULT
bool "Allow slab caches to be merged"
- default y
+ default n
depends on SLAB || SLUB
help
For reduced kernel memory fragmentation, slab caches can be
@@ -264,8 +264,8 @@ config SLAB_MERGE_DEFAULT
overwrite objects from merged caches (and more easily control
cache layout), which makes such heap attacks easier to exploit
by attackers. By keeping caches unmerged, these kinds of exploits
- can usually only damage objects in the same cache. To disable
- merging at runtime, "slab_nomerge" can be passed on the kernel
+ can usually only damage objects in the same cache. To enable
+ merging at runtime, "slab_merge" can be passed on the kernel
command line.

config SLAB_FREELIST_RANDOM
--
2.40.1



2023-06-27 19:57:36

by David Rientjes

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On Tue, 27 Jun 2023, Julian Pidancet wrote:

> Make CONFIG_SLAB_MERGE_DEFAULT default to n unless CONFIG_SLUB_TINY is
> enabled. Benefits of slab merging is limited on systems that are not
> memory constrained: the overhead is negligible and evidence of its
> effect on cache hotness is hard to come by.
>

I don't have an objection to this, I think it makes sense.

When you say overhead here, I assume you're referring to memory footprint?
Did you happen to have some system-wide numbers for what that looks like
when running some benchmarks, or even what the slab usage looks like after
a fresh boot?

> On the other hand, distinguishing allocations into different slabs will
> make attacks that rely on "heap spraying" more difficult to carry out
> with success.
>
> Take sides with security in the default kernel configuration over
> questionnable performance benefits/memory efficiency.
>
> Signed-off-by: Julian Pidancet <[email protected]>
> ---
> In an attempt to assess the performance impact of disabling slab
> merging, a timed linux kernel compilation test has been conducted first
> using slab_merge, then using slab_nomerge. Both tests started in an
> identical state. Commodity hardware was used: a laptop with an AMD Ryzen
> 5 3500U CPU, and 16GiB of RAM. The kernel source files were placed on
> an XFS partition because of the extensive use of slab caches in XFS.
>
> The results are as follows:
>
> | slab_merge | slab_nomerge |
> ------+------------------+------------------|
> Time | 489.074 ± 10.334 | 489.975 ± 10.350 |
> Min | 459.688 | 460.554 |
> Max | 493.126 | 494.282 |
>
> The benchmark favors the configuration where merging is disabled, but the
> difference is only ~0.18%, well under statistical significance.
>

I think this data should be in the changelog itself, as well as any
numbers to share on the memory footprint differences.

> .../admin-guide/kernel-parameters.txt | 29 ++++++++++---------
> Documentation/mm/slub.rst | 5 ++--
> mm/Kconfig | 6 ++--
> 3 files changed, 21 insertions(+), 19 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 9e5bab29685f..5fbf6ed3c62e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -5652,21 +5652,22 @@
>
> slram= [HW,MTD]
>
> - slab_merge [MM]
> - Enable merging of slabs with similar size when the
> - kernel is built without CONFIG_SLAB_MERGE_DEFAULT.
> -
> slab_nomerge [MM]
> - Disable merging of slabs with similar size. May be
> - necessary if there is some reason to distinguish
> - allocs to different slabs, especially in hardened
> - environments where the risk of heap overflows and
> - layout control by attackers can usually be
> - frustrated by disabling merging. This will reduce
> - most of the exposure of a heap attack to a single
> - cache (risks via metadata attacks are mostly
> - unchanged). Debug options disable merging on their
> - own.
> + Disable merging of slabs with similar size when
> + the kernel is built with CONFIG_SLAB_MERGE_DEFAULT.
> + Allocations of the same size made in distinct
> + caches will be placed in separate slabs. In
> + hardened environment, the risk of heap overflows
> + and layout control by attackers can usually be
> + frustrated by disabling merging.
> +
> + slab_merge [MM]
> + Enable merging of slabs with similar size. May be
> + necessary to reduce overhead or increase cache
> + hotness of objects, at the cost of increased
> + exposure in case of a heap attack to a single
> + cache. (risks via metadata attacks are mostly
> + unchanged).
> For more information see Documentation/mm/slub.rst.
>
> slab_max_order= [MM, SLAB]
> diff --git a/Documentation/mm/slub.rst b/Documentation/mm/slub.rst
> index be75971532f5..e2549f4a98dd 100644
> --- a/Documentation/mm/slub.rst
> +++ b/Documentation/mm/slub.rst
> @@ -122,8 +122,9 @@ used on the wrong slab.
> Slab merging
> ============
>
> -If no debug options are specified then SLUB may merge similar slabs together
> -in order to reduce overhead and increase cache hotness of objects.
> +If the kernel is built with ``CONFIG_SLAB_MEGE_DEFAULT`` or if ``slab_merge``

s/MEGE/MERGE/

> +is specified on the kernel command line, then SLUB may merge similar slabs
> +together in order to reduce overhead and increase cache hotness of objects.

Specify that this is memory overhead?

> ``slabinfo -a`` displays which slabs were merged together.
>
> Slab validation
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 7672a22647b4..05b0304302d4 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -255,7 +255,7 @@ config SLUB_TINY
>
> config SLAB_MERGE_DEFAULT
> bool "Allow slab caches to be merged"
> - default y
> + default n
> depends on SLAB || SLUB
> help
> For reduced kernel memory fragmentation, slab caches can be
> @@ -264,8 +264,8 @@ config SLAB_MERGE_DEFAULT
> overwrite objects from merged caches (and more easily control
> cache layout), which makes such heap attacks easier to exploit
> by attackers. By keeping caches unmerged, these kinds of exploits
> - can usually only damage objects in the same cache. To disable
> - merging at runtime, "slab_nomerge" can be passed on the kernel
> + can usually only damage objects in the same cache. To enable
> + merging at runtime, "slab_merge" can be passed on the kernel
> command line.
>
> config SLAB_FREELIST_RANDOM
> --
> 2.40.1
>
>

2023-06-28 15:22:49

by Julian Pidancet

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On Tue Jun 27, 2023 at 21:32, David Rientjes wrote:
> On Tue, 27 Jun 2023, Julian Pidancet wrote:
>
> > Make CONFIG_SLAB_MERGE_DEFAULT default to n unless CONFIG_SLUB_TINY is
> > enabled. Benefits of slab merging is limited on systems that are not
> > memory constrained: the overhead is negligible and evidence of its
> > effect on cache hotness is hard to come by.
> >
>
> I don't have an objection to this, I think it makes sense.
>
> When you say overhead here, I assume you're referring to memory footprint?
> Did you happen to have some system-wide numbers for what that looks like
> when running some benchmarks, or even what the slab usage looks like after
> a fresh boot?
>

Thank you David for the quick review. I'll re-run the benchmark and
measure slab usage when the system is under pressure.

Regards,

--
Julian


Attachments:
signature.asc (273.00 B)

2023-06-28 17:28:36

by Roman Gushchin

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On Tue, Jun 27, 2023 at 12:32:15PM -0700, David Rientjes wrote:
> On Tue, 27 Jun 2023, Julian Pidancet wrote:
>
> > Make CONFIG_SLAB_MERGE_DEFAULT default to n unless CONFIG_SLUB_TINY is
> > enabled. Benefits of slab merging is limited on systems that are not
> > memory constrained: the overhead is negligible and evidence of its
> > effect on cache hotness is hard to come by.
> >
>
> I don't have an objection to this, I think it makes sense.

+1

I believe the overhead was much larger when we had per-memcg slab caches,
but now it should be fairly small on most systems.

But I wonder if we need a new flag (SLAB_MERGE?) to explicitly force merging
on per-slab cache basis. I believe there are some cases when slab caches can
be created in noticeable numbers and in those cases the memory footprint might
be noticeable.

Thanks!

2023-06-28 18:53:46

by Lameter, Christopher

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On Wed, 28 Jun 2023, Roman Gushchin wrote:

> But I wonder if we need a new flag (SLAB_MERGE?) to explicitly force merging
> on per-slab cache basis. I believe there are some cases when slab caches can
> be created in noticeable numbers and in those cases the memory footprint might
> be noticeable.

One of the uses for merging are the kmalloc like slab cache arrays
created by various subsystem for their allocations. This is particularly
useful for small frequently used caches that seem to have similar
configurations. See slabinfo output below.

Also you are doing the tests on a 4k page systems. We prefer 64k page
sized systems here where the waste due to duplication of structures is
higher. Also the move on x86 is to go to higher page sizes (see the
folio approach by Matthew Wilcox) and this approach would increase the
memory footprint if large folio sizes are used.

Moreover our system here is sensitive to cpu cache pressure given our high
core count which will be caused by the increased cache footprint due to
not merging caches if this patch is accepted.


Here are the aliases on my Ampere Altra system:

root@eng08sys-r113:/home/cl/linux/tools/mm# ./slabinfo -a

:0000024 <- audit_buffer lsm_file_cache
:0000032 <- sd_ext_cdb ext4_io_end_vec fsnotify_mark_connector lsm_inode_cache xfs_refc_intent numa_policy
:0000040 <- xfs_extfree_intent ext4_system_zone
:0000048 <- Acpi-Namespace shared_policy_node xfs_log_ticket xfs_ifork ext4_bio_post_read_ctx ksm_mm_slot
:0000056 <- ftrace_event_field Acpi-Parse xfs_defer_pending file_lock_ctx
:0000064 <- fanotify_path_event ksm_stable_node xfs_rmap_intent jbd2_inode ksm_rmap_item io dmaengine-unmap-2 zswap_entry xfs_bmap_intent iommu_iova vmap_area
:0000072 <- fanotify_perm_event fanotify_fid_event Acpi-Operand
:0000080 <- Acpi-ParseExt Acpi-State audit_tree_mark
:0000088 <- xfs_attr_intent trace_event_file configfs_dir_cache kernfs_iattrs_cache blkdev_ioc
:0000128 <- kernfs_node_cache btree_node
:0000176 <- xfs_iul_item xfs_attrd_item xfs_cud_item xfs_bud_item xfs_rud_item
:0000184 <- xfs_icr ip6-frags
:0000192 <- ip6_mrt_cache ip_dst_cache aio_kiocb uid_cache inet_peer_cache bio_integrity_payload ip_mrt_cache dmaengine-unmap-16 skbuff_ext_cache
:0000200 <- xfs_inobt_cur xfs_refcbt_cur ip4-frags


2023-06-28 21:09:17

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On Tue, Jun 27, 2023 at 03:21:31PM +0200, Julian Pidancet wrote:
> Make CONFIG_SLAB_MERGE_DEFAULT default to n unless CONFIG_SLUB_TINY is
> enabled. Benefits of slab merging is limited on systems that are not
> memory constrained: the overhead is negligible and evidence of its
> effect on cache hotness is hard to come by.
>
> On the other hand, distinguishing allocations into different slabs will
> make attacks that rely on "heap spraying" more difficult to carry out
> with success.
>
> Take sides with security in the default kernel configuration over
> questionnable performance benefits/memory efficiency.
>
> Signed-off-by: Julian Pidancet <[email protected]>
> ---
> In an attempt to assess the performance impact of disabling slab
> merging, a timed linux kernel compilation test has been conducted first
> using slab_merge, then using slab_nomerge. Both tests started in an
> identical state. Commodity hardware was used: a laptop with an AMD Ryzen
> 5 3500U CPU, and 16GiB of RAM. The kernel source files were placed on
> an XFS partition because of the extensive use of slab caches in XFS.
>
> The results are as follows:
>
> | slab_merge | slab_nomerge |
> ------+------------------+------------------|
> Time | 489.074 ? 10.334 | 489.975 ? 10.350 |
> Min | 459.688 | 460.554 |
> Max | 493.126 | 494.282 |
>
> The benchmark favors the configuration where merging is disabled, but the
> difference is only ~0.18%, well under statistical significance.

As mentioned, please include these kinds of perf notes in the commit
log; it's useful to see later. :)

Regardless, yes, please. I have been running slab_nomerge on all my
systems for years and years now.

With the typo fixed and commit log updated, please consider this:

Reviewed-by: Kees Cook <[email protected]>

-Kees

--
Kees Cook

2023-06-29 07:35:31

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On 6/28/23 18:44, Roman Gushchin wrote:
> On Tue, Jun 27, 2023 at 12:32:15PM -0700, David Rientjes wrote:
>> On Tue, 27 Jun 2023, Julian Pidancet wrote:
>>
>> > Make CONFIG_SLAB_MERGE_DEFAULT default to n unless CONFIG_SLUB_TINY is
>> > enabled. Benefits of slab merging is limited on systems that are not
>> > memory constrained: the overhead is negligible and evidence of its
>> > effect on cache hotness is hard to come by.
>> >
>>
>> I don't have an objection to this, I think it makes sense.
>
> +1
>
> I believe the overhead was much larger when we had per-memcg slab caches,
> but now it should be fairly small on most systems.
>
> But I wonder if we need a new flag (SLAB_MERGE?) to explicitly force merging
> on per-slab cache basis.

Damn, we just tried to add SLAB_NO_MERGE, that is if Linus pulls the PR, as
I've just found out that the last time he hated the idea [1] :) (but at the
same time I think the current attempt is very different in that it's not
coming via a random tree, and the comments make it clear that it's not for
everyone to enable in production configs just because they think they are
special).

But SLAB_MERGE, I doubt it would get many users being opt-in. People would
have to consciously opt-in to not being special.

As for changing the default, we definitely need to see the memory usage
results first, as was mentioned. It's not expected that disabling merging
would decrease performance, so no wonder the test didn't find such decrease,
but the expected downside is really increased memory overhead.

But then again it's just a default and most people would use a distro config
anyway, and neither option seems to be an obvious winner to me? As for the
"security by default" argument, AFAIK we don't enable freelist
hardening/randomization by default, and I thought (not being the expert on
this) the heap spraying attacks concerned mainly generic kmalloc cache users
(see also [2]) and not some specific named caches being merged?

[1]
https://lore.kernel.org/all/CA+55aFyepmdpbg9U2Pvp+aHjKmmGCrTK2ywzqfmaOTMXQasYNw@mail.gmail.com/
[2]
https://lore.kernel.org/all/[email protected]/

> I believe there are some cases when slab caches can
> be created in noticeable numbers and in those cases the memory footprint might
> be noticeable.
>
> Thanks!


2023-07-06 14:33:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

The slab merging has always been bothering me as it makes debugging
things really hard. I agree with the other comments on improving
the commit log, but with that:

Acked-by: Christoph Hellwig <[email protected]>

2023-07-06 18:37:53

by Lameter, Christopher

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On Thu, 6 Jul 2023, Christoph Hellwig wrote:

> The slab merging has always been bothering me as it makes debugging
> things really hard. I agree with the other comments on improving
> the commit log, but with that:

Debugging is enabled by specifying "slub_debug" on the kernel command line
which disables merging and also enables checks so that data corruption
does trigger meaningful messages for debugging.

Without that you may see errors coming from the slab subsystem that are
due to data corruption by various subsystems. Without that it will be
difficult to properly attribute errors anyways.


2023-10-12 16:43:55

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] mm/slub: disable slab merging in the default configuration

On Thu, Jun 29, 2023 at 09:21:14AM +0200, Vlastimil Babka wrote:
> On 6/28/23 18:44, Roman Gushchin wrote:
> > On Tue, Jun 27, 2023 at 12:32:15PM -0700, David Rientjes wrote:
> >> On Tue, 27 Jun 2023, Julian Pidancet wrote:
> >>
> >> > Make CONFIG_SLAB_MERGE_DEFAULT default to n unless CONFIG_SLUB_TINY is
> >> > enabled. Benefits of slab merging is limited on systems that are not
> >> > memory constrained: the overhead is negligible and evidence of its
> >> > effect on cache hotness is hard to come by.
> >> >
> >>
> >> I don't have an objection to this, I think it makes sense.
> >
> > +1
> >
> > I believe the overhead was much larger when we had per-memcg slab caches,
> > but now it should be fairly small on most systems.
> >
> > But I wonder if we need a new flag (SLAB_MERGE?) to explicitly force merging
> > on per-slab cache basis.
>
> Damn, we just tried to add SLAB_NO_MERGE, that is if Linus pulls the PR, as
> I've just found out that the last time he hated the idea [1] :) (but at the
> same time I think the current attempt is very different in that it's not
> coming via a random tree, and the comments make it clear that it's not for
> everyone to enable in production configs just because they think they are
> special).
>
> But SLAB_MERGE, I doubt it would get many users being opt-in. People would
> have to consciously opt-in to not being special.
>
> As for changing the default, we definitely need to see the memory usage
> results first, as was mentioned. It's not expected that disabling merging
> would decrease performance, so no wonder the test didn't find such decrease,
> but the expected downside is really increased memory overhead.

Did this analysis happen? Apologies if I missed it...

> But then again it's just a default and most people would use a distro config
> anyway, and neither option seems to be an obvious winner to me? As for the
> "security by default" argument, AFAIK we don't enable freelist
> hardening/randomization by default, and I thought (not being the expert on
> this) the heap spraying attacks concerned mainly generic kmalloc cache users
> (see also [2]) and not some specific named caches being merged?
>
> [1] https://lore.kernel.org/all/CA+55aFyepmdpbg9U2Pvp+aHjKmmGCrTK2ywzqfmaOTMXQasYNw@mail.gmail.com/
> [2] https://lore.kernel.org/all/[email protected]/

I'm a fan of turning on any of these "by default", as that's been the
historical approach, which tends to span years:

- security feature introduced, default off in the kernel
- distros enable it by default
- kernel makes it default on

So perhaps we're better off making the other hardening features on by
default since distros have been shipping with them for years now?

-Kees

--
Kees Cook