2013-06-07 08:56:52

by Roman Gushchin

Subject: slub: slab order on multi-processor machines

Hi!

While investigating some compaction-related problems, I noticed that many (even most)
kernel objects are allocated from slabs of order 2 or 3.

This behavior was introduced by commit 9b2cd506e "slub: Calculate min_objects based on
number of processors." by Christoph Lameter.
As I understand it, the idea was to make kernel allocations cheaper by reducing the total
number of page allocations (allocating one order-3 page is cheaper than allocating
eight order-0 pages).
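
To make the effect concrete, here is a rough sketch of how the minimum object count, and with it the slab order, grows with the number of CPUs. It assumes the heuristic is min_objects = 4 * (fls(nr_cpus) + 1), which is how I recall it from mm/slub.c, together with a 4 KiB page and a maximum slab order of 3. The function names are made up for illustration, and the model ignores the wasted-space heuristic and per-slab metadata, so treat it as an approximation rather than the kernel's actual calculation.

```python
PAGE_SIZE = 4096      # assumed page size in bytes
SLUB_MAX_ORDER = 3    # assumed default cap on the slab order


def fls(x: int) -> int:
    """Position of the highest set bit, 1-based; 0 when x == 0."""
    return x.bit_length()


def min_objects(nr_cpus: int) -> int:
    """Assumed heuristic: minimum objects per slab grows with log2(CPUs)."""
    return 4 * (fls(nr_cpus) + 1)


def slab_order(object_size: int, nr_cpus: int) -> int:
    """Smallest order whose slab holds min_objects, capped at SLUB_MAX_ORDER.

    Simplified model: ignores wasted space and per-slab metadata.
    """
    wanted = min_objects(nr_cpus)
    for order in range(SLUB_MAX_ORDER + 1):
        if (PAGE_SIZE << order) // object_size >= wanted:
            return order
    return SLUB_MAX_ORDER


if __name__ == "__main__":
    for cpus in (1, 16):
        orders = {size: slab_order(size, cpus) for size in (64, 256, 512, 1024)}
        print(f"cpus={cpus} min_objects={min_objects(cpus)} orders={orders}")
```

Under these assumptions a 512-byte object fits the minimum in a single page on a 1-CPU machine, but needs an order-2 slab on a 16-CPU machine, and a 1024-byte object needs order 3.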

I'm sure that's true for a recently rebooted machine with a lot of free, non-fragmented memory.
But is it also true for a heavily loaded machine with fragmented memory?
Are we sure that it's cheaper to run compaction and allocate an order-3 page than to use
small single-page slabs?
Am I missing something?

Disabling this behavior dramatically reduces the number of order-2 and order-3 allocations.
Compaction runs significantly less often. This is especially noticeable on machines
with intensive disk I/O. I do not see any performance degradation, but I'm not sure
that I'm not missing something.

Any comments and/or ideas are welcome.

Thanks!

Regards,
Roman


Subject: Re: slub: slab order on multi-processor machines

On Fri, 7 Jun 2013, Roman Gushchin wrote:

> As I understand, the idea was to make kernel allocations cheaper by reducing
> the total number of page allocations (allocating 1 page with order 3 is
> cheaper than allocating 8 1-ordered pages).

It also affects allocator speed. With fewer page structures to
manage, the metadata effort is reduced. With more objects in a page,
the fastpath of slub is more likely to be used (visible in allocator
benchmarks). Slub can fall back dynamically to order 0 pages if necessary,
so it can opportunistically take advantage of contiguous pages.
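
The fallback described above can be illustrated with a toy model. This is only a sketch: memory is a list of booleans (True means the page is free), a block of a given order must be naturally aligned, and the names find_free_block and allocate_slab are hypothetical. It does not model the real page allocator, GFP flags, reclaim, or compaction.

```python
from typing import List, Optional, Tuple


def find_free_block(free: List[bool], order: int) -> Optional[int]:
    """Return the first naturally aligned run of 2**order free pages, if any."""
    span = 1 << order
    for start in range(0, len(free) - span + 1, span):
        if all(free[start:start + span]):
            return start
    return None


def allocate_slab(free: List[bool], preferred_order: int,
                  min_order: int = 0) -> Optional[Tuple[int, int]]:
    """Try the preferred order first, then fall back to min_order.

    Returns (start_page, order) and marks those pages as used,
    or None when neither attempt succeeds.
    """
    orders = [preferred_order]
    if min_order != preferred_order:
        orders.append(min_order)
    for order in orders:
        start = find_free_block(free, order)
        if start is not None:
            for page in range(start, start + (1 << order)):
                free[page] = False
            return start, order
    return None


if __name__ == "__main__":
    fresh = [True] * 16              # unfragmented memory
    fragmented = [True, False] * 8   # every second page in use
    print(allocate_slab(fresh, 3))       # gets an order-3 block
    print(allocate_slab(fragmented, 3))  # falls back to a single page
```

On the unfragmented list the order-3 request succeeds directly; on the fragmented one no aligned run of eight pages exists, so the allocation is satisfied with a single order-0 page instead.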

> I'm sure, it's true for recently rebooted machine with a lot of free
> non-fragmented memory. But is it also true for heavy-loaded machine with
> fragmented memory? Are we sure, that it's cheaper to run compaction and
> allocate order 3 page than to use small 1-pages slabs? Do I miss
> something?

We do have defragmentation logic and defragmentation passes to address
that. In general, the need for larger physically contiguous memory segments
will increase as RAM gets larger and larger. Maybe 2M is the next step, but
we will always have to face fragmentation regardless of what the next size
is.

2013-06-07 17:09:40

by Roman Gushchin

Subject: Re: slub: slab order on multi-processor machines

On 07.06.2013 18:12, Christoph Lameter wrote:
> On Fri, 7 Jun 2013, Roman Gushchin wrote:
>
>> As I understand, the idea was to make kernel allocations cheaper by reducing
>> the total number of page allocations (allocating 1 page with order 3 is
>> cheaper than allocating 8 1-ordered pages).
>
> Its also affecting allocator speed. By having less page structures to
> manage the metadata effort is reduced. By having more objects in a page
> the fastpath of slub is more likely to be used (Visible in allocator
> benchmarks). Slub can fall back dynamically to order 0 pages if necessary.
> So it can take opportunistically take advantage of contiguous pages.

Thank you for the clarification!

Maybe it would be reasonable to fall back to order 0 pages if it's not possible
to allocate a new large page without direct compaction?
I'll try to perform some tests here.

Regards,
Roman