2008-08-01 18:36:22

by Christoph Lameter

Subject: [patch 00/19] Slab Fragmentation Reduction V13

V12->V13:
- Rebase onto Linux 2.6.27-rc1 (deal with page flags conversion, ctor parameters etc)
- Fix uninitialized variable issue

Slab fragmentation is mainly an issue if Linux is used as a fileserver
and large amounts of dentries, inodes and buffer heads accumulate. In some
load situations the slabs become very sparsely populated so that a lot of
memory is wasted by slabs that only contain one or a few objects. In
extreme cases the performance of a machine will become sluggish since
we are continually running reclaim without much success.
Slab defragmentation adds the capability to recover the memory that
is wasted.

Memory reclaim for the following slab caches is possible:

1. dentry cache
2. inode cache (with a generic interface to allow easy setup of more
filesystems than the currently supported ext2/3/4, reiserfs, XFS
and proc)
3. buffer_heads

One typical mechanism that triggers slab defragmentation on my systems
is the daily run of

updatedb

Updatedb scans all files on the system which causes a high inode and dentry
use. After updatedb is complete we need to go back to the regular use
patterns (typical on my machine: kernel compiles). Those need the memory now
for different purposes. The inodes and dentries used for updatedb will
gradually be aged by the dentry/inode reclaim algorithm, which will free
up the dentries and inodes randomly throughout the slabs that were
allocated. As a result the slabs will become sparsely populated. If they
become empty then they can be freed, but a lot of them will remain sparsely
populated. That is where slab defrag comes in: it removes the objects from
the slabs with just a few entries, reclaiming more memory for other uses.
In the simplest case (as provided here) this is done by simply reclaiming
the objects.

However, if the logic in the kick() function is made more
sophisticated then we will be able to move the objects out of the slabs.
If slabs are fragmented, objects can be allocated without involving the
page allocator because a large number of free slots is available. Moving
an object into such a slab will reduce fragmentation in the slab it is moved to.
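
As an illustration of the kick() function mentioned above and its companion
get() (the "2 functions that now take arrays of objects" from the V1->V2
notes below), here is a rough sketch of what a cache could register. The
registration call and the helper names are assumptions made for this sketch,
not necessarily the exact interface of the series:

/*
 * Hypothetical sketch of the two per-cache defrag callbacks.  get()
 * pins the objects that were picked out of a sparsely populated slab,
 * kick() then tries to get rid of them.  my_object_get()/my_object_put()
 * and kmem_cache_setup_defrag() are assumed names, not the real API.
 */
static void *my_cache_get(struct kmem_cache *s, int nr, void **objects)
{
        int i;

        for (i = 0; i < nr; i++)
                my_object_get(objects[i]);      /* pin so the object cannot vanish */
        return NULL;                            /* no private state needed here */
}

static void my_cache_kick(struct kmem_cache *s, int nr, void **objects,
                          void *private)
{
        int i;

        /*
         * In the simplest case (as in this series) kick() just drops the
         * objects; a more sophisticated kick() could reallocate and move
         * them into other, partially filled slabs instead.
         */
        for (i = 0; i < nr; i++)
                my_object_put(objects[i]);
}

/* Registered once at cache creation time (function name assumed): */
/* kmem_cache_setup_defrag(my_cache, my_cache_get, my_cache_kick); */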

V11->V12:
- Pekka and I fixed various minor issues pointed out by Andrew.
- Split ext2/3/4 defrag support patches.
- Add more documentation
- Revise the way that slab defrag is triggered from reclaim. No longer
use a timeout but track the amount of slab reclaim done by the shrinkers.
Add a field in /proc/sys/vm/slab_defrag_limit to control the threshold.
- Display current slab_defrag_counters in /proc/zoneinfo (for a zone) and
/proc/sys/vm/slab_defrag_count (for global reclaim).
- Add new config value slab_defrag_limit to /proc/sys/vm/slab_defrag_limit
- Add a patch that obsoletes SLAB and explains why SLOB does not support
defrag (Either of those could be theoretically equipped to support
slab defrag in some way but it seems that Andrew/Linus want to reduce
the number of slab allocators).

V10->V11
- Simplify determination when to reclaim: Just scan over all partials
and check if they are sparsely populated.
- Add support for performance counters
- Rediff on top of current slab-mm.
- Reduce frequency of scanning. A look at the stats showed that we
were calling into reclaim very frequently when the system was under
memory pressure which slowed things down. Various measures to
avoid scanning the partial list too frequently were added and the
earlier (expensive) method of determining the defrag ratio of the slab
cache as a whole was dropped. I think this addresses the issues that
Mel saw with V10.

V9->V10
- Rediff against upstream

V8->V9
- Rediff against 2.6.24-rc6-mm1

V7->V8
- Rediff against 2.6.24-rc3-mm2

V6->V7
- Rediff against 2.6.24-rc2-mm1
- Remove lumpy reclaim support. No point anymore given that the antifrag
handling in 2.6.24-rc2 puts reclaimable slabs into different sections.
Targeted reclaim never triggers. This has to wait until we make
slabs movable or we need to perform a special version of lumpy reclaim
in SLUB while we scan the partial lists for slabs to kick out.
Removal simplifies handling significantly since we
get to slabs in a more controlled way via the partial lists.
The patchset now provides pure reduction of fragmentation levels.
- SLAB/SLOB: Provide inlines that do nothing
- Fix various smaller issues that were brought up during review of V6.

V5->V6
- Rediff against 2.6.24-rc2 + mm slub patches.
- Add reviewed by lines.
- Take out the experimental code to make slab pages movable. That
has to wait until this has been considered by Mel.

V4->V5:
- Support lumpy reclaim for slabs
- Support reclaim via slab_shrink()
- Add constructors to ensure a consistent object state at all times.

V3->V4:
- Optimize scan for slabs that need defragmentation
- Add /sys/slab/*/defrag_ratio to allow setting defrag limits
per slab.
- Add support for buffer heads.
- Describe how the cleanup after the daily updatedb can be
improved by slab defragmentation.

V2->V3
- Support directory reclaim
- Add infrastructure to trigger defragmentation after slab shrinking if we
have slabs with a high degree of fragmentation.

V1->V2
- Clean up control flow using a state variable. Simplify API. Back to 2
functions that now take arrays of objects.
- Inode defrag support for a set of filesystems
- Fix up dentry defrag support to work on negative dentries by adding
a new dentry flag that indicates that a dentry is not in the process
of being freed or allocated.

--


2008-08-03 01:58:58

by Matthew Wilcox

Subject: No, really, stop trying to delete slab until you've finished making slub perform as well

On Fri, May 09, 2008 at 07:21:01PM -0700, Christoph Lameter wrote:
> - Add a patch that obsoletes SLAB and explains why SLOB does not support
> defrag (Either of those could be theoretically equipped to support
> slab defrag in some way but it seems that Andrew/Linus want to reduce
> the number of slab allocators).

Do we have to once again explain that slab still outperforms slub on at
least one important benchmark? I hope Nick Piggin finds time to finish
tuning slqb; it already outperforms slub.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2008-08-03 21:29:19

by Pekka Enberg

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Hi Matthew,

Matthew Wilcox wrote:
> Do we have to once again explain that slab still outperforms slub on at
> least one important benchmark? I hope Nick Piggin finds time to finish
> tuning slqb; it already outperforms slub.

No, you don't have to. I haven't merged that patch nor do I intend to do
so until the regressions are fixed.

And yes, I'm still waiting to hear from you how we're now doing with
higher order page allocations...

Pekka

2008-08-04 02:39:20

by Rene Herman

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

On 03-08-08 23:25, Pekka Enberg wrote:

> Matthew Wilcox wrote:

>> Do we have to once again explain that slab still outperforms slub on at
>> least one important benchmark? I hope Nick Piggin finds time to finish
>> tuning slqb; it already outperforms slub.
>
> No, you don't have to. I haven't merged that patch nor do I intend to do
> so until the regressions are fixed.
>
> And yes, I'm still waiting to hear from you how we're now doing with
> higher order page allocations...

General interest question -- I recently "accidentally" read some of
slub and I believe that it doesn't feature the cache colouring support
that slab did? Is that true, and if so, wasn't it needed/useful?

Rene.

2008-08-04 13:44:57

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Matthew Wilcox wrote:
> On Fri, May 09, 2008 at 07:21:01PM -0700, Christoph Lameter wrote:
>> - Add a patch that obsoletes SLAB and explains why SLOB does not support
>> defrag (Either of those could be theoretically equipped to support
>> slab defrag in some way but it seems that Andrew/Linus want to reduce
>> the number of slab allocators).
>
> Do we have to once again explain that slab still outperforms slub on at
> least one important benchmark? I hope Nick Piggin finds time to finish
> tuning slqb; it already outperforms slub.
>

Uhh. I forgot to delete that statement. I did not include the patch in the series.

We have a fundamental design issue there. Queuing on free can result in
better performance as in SLAB. However, it limits concurrency (per node lock
taking) and causes latency spikes due to queue processing (f.e. one test load
had 118.65 vs. 34 usecs just by switching to SLUB).

Could you address the performance issues in different ways? F.e. try to free
when the object is hot or free from multiple processors? SLAB has to take the
list_lock rather frequently under high concurrent loads (depends on queue
size). That will not occur with SLUB. So you actually can free (and allocate)
concurrently with high performance.

2008-08-04 14:48:44

by Jamie Lokier

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Christoph Lameter wrote:
> Matthew Wilcox wrote:
> > On Fri, May 09, 2008 at 07:21:01PM -0700, Christoph Lameter wrote:
> >> - Add a patch that obsoletes SLAB and explains why SLOB does not support
> >> defrag (Either of those could be theoretically equipped to support
> >> slab defrag in some way but it seems that Andrew/Linus want to reduce
> >> the number of slab allocators).
> >
> > Do we have to once again explain that slab still outperforms slub on at
> > least one important benchmark? I hope Nick Piggin finds time to finish
> > tuning slqb; it already outperforms slub.
> >
>
> Uhh. I forgot to delete that statement. I did not include the patch
> in the series.
>
> We have a fundamental design issue there. Queuing on free can result in
> better performance as in SLAB. However, it limits concurrency (per node lock
> taking) and causes latency spikes due to queue processing (f.e. one test load
> had 118.65 vs. 34 usecs just by switching to SLUB).

Vaguely on this topic, has anyone studied the effects of SLAB/SLUB
etc. on MMUless systems?

-- Jamie

2008-08-04 15:13:37

by Rik van Riel

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

On Mon, 04 Aug 2008 08:43:21 -0500
Christoph Lameter <[email protected]> wrote:
> Matthew Wilcox wrote:
> > On Fri, May 09, 2008 at 07:21:01PM -0700, Christoph Lameter wrote:
> >> - Add a patch that obsoletes SLAB and explains why SLOB does not support
> >> defrag (Either of those could be theoretically equipped to support
> >> slab defrag in some way but it seems that Andrew/Linus want to reduce
> >> the number of slab allocators).
> >
> > Do we have to once again explain that slab still outperforms slub on at
> > least one important benchmark? I hope Nick Piggin finds time to finish
> > tuning slqb; it already outperforms slub.
> >
>
> Uhh. I forgot to delete that statement. I did not include the patch in the series.
>
> We have a fundamental design issue there. Queuing on free can result in
> better performance as in SLAB. However, it limits concurrency (per node lock
> taking) and causes latency spikes due to queue processing (f.e. one test load
> had 118.65 vs. 34 usecs just by switching to SLUB).
>
> Could you address the performance issues in different ways? F.e. try to free
> when the object is hot or free from multiple processors? SLAB has to take the
> list_lock rather frequently under high concurrent loads (depends on queue
> size). That will not occur with SLUB. So you actually can free (and allocate)
> concurrently with high performance.

I guess you could bypass the queueing on free for objects that
come from a "local" SLUB page, only queueing objects that go
onto remote pages.

That way workloads that already perform well with SLUB should
keep the current performance, while workloads that currently
perform badly with SLUB should get an improvement.
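
A minimal sketch of that idea, with the queue structure and all helper
names invented for illustration (this is not existing SLUB code):

/*
 * Sketch of the suggestion: free directly when the object's slab page
 * sits on the local NUMA node, otherwise batch the free on a small
 * queue.  struct remote_queue, this_cpu_remote_queue(), __free_to_slab()
 * and flush_remote_queue() are invented names.
 */
struct remote_queue {
        int     count;
        void    *objects[32];
};

static void sketch_free(struct kmem_cache *s, void *object)
{
        struct page *page = virt_to_head_page(object);
        struct remote_queue *q;

        if (page_to_nid(page) == numa_node_id()) {
                __free_to_slab(s, page, object);        /* fast path, as today */
                return;
        }

        /* Remote page: defer the free and flush in batches. */
        q = this_cpu_remote_queue(s);
        q->objects[q->count++] = object;
        if (q->count == ARRAY_SIZE(q->objects))
                flush_remote_queue(s, q);
}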

--
All Rights Reversed

2008-08-04 15:22:08

by Jamie Lokier

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Jamie Lokier wrote:
> Vaguely on this topic, has anyone studied the effects of SLAB/SLUB
> etc. on MMUless systems?

The reason I ask is that MMU-less systems are extremely sensitive to
fragmentation. Every program started on those systems must allocate a
large contiguous block for its code and data, and every malloc() larger
than a page is in the same situation. If memory is too fragmented,
starting new programs fails.

The high-order page-allocator defragmentation lately should help with
that.

The different behaviours of SLAB/SLUB might result in different levels
of fragmentation, so I wonder if anyone has compared them on MMU-less
systems or fragmentation-sensitive workloads on general systems.

Thanks,
-- Jamie

2008-08-04 16:03:27

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Rik van Riel wrote:

> I guess you could bypass the queueing on free for objects that
> come from a "local" SLUB page, only queueing objects that go
> onto remote pages.

Tried that already. The logic to decide if an object is local is creating
significant overhead. Plus you need queues for the remote nodes. Back to alien
queues?

2008-08-04 16:36:48

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Jamie Lokier wrote:

> The different behaviours of SLAB/SLUB might result in different levels
> of fragmentation, so I wonder if anyone has compared them on MMU-less
> systems or fragmentation-sensitive workloads on general systems.

Never heard of such a comparison.

MMU-less systems typically have a minimal number of processors. For that
configuration the page orders SLUB uses are roughly equivalent to SLAB's.
Larger orders only come into play with large numbers of processors.

2008-08-04 16:47:55

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Hi

> Could you address the performance issues in different ways? F.e. try to free
> when the object is hot or free from multiple processors? SLAB has to take the
> list_lock rather frequently under high concurrent loads (depends on queue
> size). That will not occur with SLUB. So you actually can free (and allocate)
> concurrently with high performance.

Just for information (off topic?):

When hackbench is running, SLUB consumes much more memory than SLAB.
Then SLAB often outperforms SLUB under memory starvation.

I don't know why the memory consumption differs.
Does anyone know?

2008-08-04 17:14:52

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:

> When hackbench is running, SLUB consumes much more memory than SLAB.
> Then SLAB often outperforms SLUB under memory starvation.
>
> I don't know why the memory consumption differs.
> Does anyone know?

Can you quantify the difference?

SLAB buffers objects in its queues. SLUB does rely more on the page allocator.
So SLAB may have its own reserves to fall back on.

2008-08-04 17:20:25

by Pekka Enberg

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

On Mon, Aug 4, 2008 at 8:13 PM, Christoph Lameter
<[email protected]> wrote:
> KOSAKI Motohiro wrote:
>
>> When hackbench is running, SLUB consumes much more memory than SLAB.
>> Then SLAB often outperforms SLUB under memory starvation.
>>
>> I don't know why the memory consumption differs.
>> Does anyone know?
>
> Can you quantify the difference?
>
> SLAB buffers objects in its queues. SLUB does rely more on the page allocator.
> So SLAB may have its own reserves to fall back on.

Also, what kind of machine are we talking about here? If there are a
lot of CPUs, SLUB will allocate higher order pages more aggressively
than SLAB by default.

2008-08-04 17:20:57

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:
>
> When hackbench is running, SLUB consumes much more memory than SLAB.
> Then SLAB often outperforms SLUB under memory starvation.

Re memory use: If SLUB finds that there is lock contention on a slab page then
it will allocate a new one and dedicate it to a cpu in order to avoid future
contention.
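
For context, the partial-list scan that leads to this behaviour looks
roughly like the sketch below (condensed from the 2.6.27-era mm/slub.c,
not a verbatim copy):

/*
 * Under n->list_lock, take the first partial slab whose page lock we
 * can grab with a trylock.  If every partial slab is currently locked
 * by another CPU, NULL is returned and the caller falls back to
 * allocating a brand-new slab page -- which is how lock contention can
 * turn into extra memory consumption.
 */
static struct page *get_partial_node_sketch(struct kmem_cache_node *n)
{
        struct page *page;

        if (!n || !n->nr_partial)
                return NULL;

        spin_lock(&n->list_lock);
        list_for_each_entry(page, &n->partial, lru) {
                if (lock_and_freeze_slab(n, page))      /* slab_trylock() inside */
                        goto out;
        }
        page = NULL;
out:
        spin_unlock(&n->list_lock);
        return page;
}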

2008-08-04 21:26:30

by Pekka Enberg

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Rene Herman wrote:
> On 03-08-08 23:25, Pekka Enberg wrote:
>
>> Matthew Wilcox wrote:
>
>>> Do we have to once again explain that slab still outperforms slub on at
>>> least one important benchmark? I hope Nick Piggin finds time to finish
>>> tuning slqb; it already outperforms slub.
>>
>> No, you don't have to. I haven't merged that patch nor do I intend to
>> do so until the regressions are fixed.
>>
>> And yes, I'm still waiting to hear from you how we're now doing with
>> higher order page allocations...
>
> General interest question -- I recently "accidentally" read some of
> slub and I believe that it doesn't feature the cache colouring support
> that slab did? Is that true, and if so, wasn't it needed/useful?

I don't know why Christoph decided not to implement it. Christoph?

2008-08-04 21:44:19

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Pekka Enberg wrote:
>> General interest question -- I recently "accidentally" read some of
>> slub and I believe that it doesn't feature the cache colouring support
>> that slab did? Is that true, and if so, wasn't it needed/useful?
>
> I don't know why Christoph decided not to implement it. Christoph?

IMHO cache coloring issues seem to be mostly taken care of by newer more
associative cpu caching designs.

Note that the SLAB design origin is Solaris (See the paper by Jeff Bonwick in
1994 that is quoted in mm/slab.c). Logic for cache coloring is mostly avoided
today due to the complexity it would introduce. See also
http://en.wikipedia.org/wiki/CPU_cache.

What one could add to support cache coloring in SLUB is a prearrangement of
the object allocation order by constructing the initial freelist for
a page in a certain way. See mm/slub.c::new_slab().
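
Purely as an illustration of that suggestion (this is not what new_slab()
currently does), the initial freelist could be built starting at a per-slab
colour index, so that the first objects handed out from successive slabs
fall into different cache sets:

/*
 * Sketch: link the free objects of a fresh slab page into a freelist
 * that starts at object 'colour' and wraps around, instead of always
 * starting at object 0.  Returns the head of the freelist.
 */
static void *build_coloured_freelist(void *addr, unsigned int objects,
                                     unsigned int size, unsigned int colour)
{
        void *head = NULL, *last = NULL;
        unsigned int i;

        if (!objects)
                return NULL;

        for (i = 0; i < objects; i++) {
                void *p = (char *)addr + ((colour + i) % objects) * size;

                if (last)
                        *(void **)last = p;     /* link previous object to this one */
                else
                        head = p;               /* first object becomes the head */
                last = p;
        }
        *(void **)last = NULL;                  /* terminate the freelist */
        return head;
}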

2008-08-04 23:09:48

by Rene Herman

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

On 04-08-08 23:41, Christoph Lameter wrote:

>>> General interest question -- I recently "accidentally" read some of
>>> slub and I believe that it doesn't feature the cache colouring support
>>> that slab did? Is that true, and if so, wasn't it needed/useful?
>> I don't know why Christoph decided not to implement it. Christoph?
>
> IMHO cache coloring issues seem to be mostly taken care of by newer more
> associative cpu caching designs.

I see. Just gathered a bit of data on this (from sandpile.org):

32-byte lines:

P54 : L1 I 8K, 2-Way
D 8K, 2-Way
L2 External

P55 : L1 I 16K, 4-Way
D 16K, 4-Way
L2 External

P2 : L1 I 16K 4-Way
D 16K 4-Way
L2 128K to 2MB 4-Way

P3 : L1 I 16K 4-Way
D 16K 4-Way
L2 128K to 2MB 4-Way or
256K to 2MB 8-Way

64-byte lines:

P4 : L1 I 12K uOP Trace (8-Way, 6 uOP line)
D 8K 4-Way or
16K 8-Way
L2 128K 2-Way or
128K, 256K 4-Way or
512K, 1M, 2M 8-Way
L3 512K 4-Way or
1M to 8M 8-Way or
2M to 16M 16-Way

Core: L1 I 32K 8-Way
D 32K 8-Way
L2 512K 2-Way or
1M 4-Way or
2M 8-Way or
3M 12-Way or
4M 16-Way

K7 : L1 I 64K 2-Way
D 64K 2-Way
L2 512, 1M, 2M 2-Way or
4M, 8M 1-Way or
64K, 256K, 512K 16-Way

K8 : L1 I 64K 2-Way
D 64K 2-Way
L2 128K to 1M 16-Way


The L1 on K7 and K8 especially still seems a bit of a worry here.

> Note that the SLAB design origin is Solaris (See the paper by Jeff Bonwick in
> 1994 that is quoted in mm/slab.c). Logic for cache coloring is mostly avoided
> today due to the complexity it would introduce. See also
> http://en.wikipedia.org/wiki/CPU_cache.
>
> What one could add to support cache coloring in SLUB is a prearrangement of
> the object allocation order by constructing the initial freelist for
> a page in a certain way. See mm/slub.c::new_slab().

<remains silent>

To me, colouring always seemed like a fairly promising thing but I won't
pretend to have any sort of data.

Rene.

2008-08-05 12:09:09

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> KOSAKI Motohiro wrote:
>
> > When hackbench is running, SLUB consumes much more memory than SLAB.
> > Then SLAB often outperforms SLUB under memory starvation.
> >
> > I don't know why the memory consumption differs.
> > Does anyone know?
>
> Can you quantify the difference?

machine spec:
CPU: IA64 x 8
MEM: 8G (4G x2node)

test method

1. echo 3 >/proc/sys/vm/drop_caches
2. % ./hackbench 90 process 1000 <- to fill the page table cache
3. % ./hackbench 90 process 1000


vmstat result

<SLAB (without CONFIG_DEBUG_SLAB)>

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 0 3223168 6016 38336 0 0 0 0 3181 4314 0 15 85 0 0
2039 2 0 2022144 6016 38336 0 0 0 0 2364 13622 0 49 51 0 0
634 0 0 2629824 6080 38336 0 0 0 64 83582 2538927 5 95 0 0 0
596 0 0 2842624 6080 38336 0 0 0 0 6864 675841 6 94 0 0 0
590 0 0 2993472 6080 38336 0 0 0 0 9514 456085 6 94 0 0 0
503 0 0 3138560 6080 38336 0 0 0 0 8042 276024 4 96 0 0 0

about 3G remain.

<SLUB>
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
1066 0 0 323008 3584 18240 0 0 0 0 12037 47353 1 99 0 0 0
1101 0 0 324672 3584 18240 0 0 0 0 6029 25100 1 99 0 0 0
913 0 0 330240 3584 18240 0 0 0 0 9694 54951 2 98 0 0 0

about 300M remain.


So, about 2.5G - 3G difference in 8G mem.



2008-08-05 15:10:24

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:

>> Can you quantify the difference?
>
> machine spec:
> CPU: IA64 x 8
> MEM: 8G (4G x2node)

16k or 64k page size?

> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 2 0 0 3223168 6016 38336 0 0 0 0 3181 4314 0 15 85 0 0
> 2039 2 0 2022144 6016 38336 0 0 0 0 2364 13622 0 49 51 0 0
> 634 0 0 2629824 6080 38336 0 0 0 64 83582 2538927 5 95 0 0 0
> 596 0 0 2842624 6080 38336 0 0 0 0 6864 675841 6 94 0 0 0
> 590 0 0 2993472 6080 38336 0 0 0 0 9514 456085 6 94 0 0 0
> 503 0 0 3138560 6080 38336 0 0 0 0 8042 276024 4 96 0 0 0
>
> about 3G remain.
>
> <SLUB>
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
> r b swpd free buff cache si so bi bo in cs us sy id wa st
> 1066 0 0 323008 3584 18240 0 0 0 0 12037 47353 1 99 0 0 0
> 1101 0 0 324672 3584 18240 0 0 0 0 6029 25100 1 99 0 0 0
> 913 0 0 330240 3584 18240 0 0 0 0 9694 54951 2 98 0 0 0
>
> about 300M remain.
>
>
> So, about 2.5G - 3G difference in 8G mem.

Well not sure if that tells us much. Please show us the output of
/proc/meminfo after each run. The slab counters indicate how much memory is
used by the slabs.

It would also be interesting to see the output of the slabinfo command after
the slub run?

2008-08-06 12:36:58

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

>>> Can you quantify the difference?
>>
>> machine spec:
>> CPU: IA64 x 8
>> MEM: 8G (4G x2node)
>
> 16k or 64k page size?

64k.


>> So, about 2.5G - 3G difference in 8G mem.
>
> Well not sure if that tells us much. Please show us the output of
> /proc/meminfo after each run. The slab counters indicate how much memory is
> used by the slabs.
>
> It would also be interesting to see the output of the slabinfo command after
> the slub run?

OK.
But I can't do that this week,
so I'll do it next week.

Honestly, I don't know how to use the slabinfo command :-)

2008-08-06 14:25:50

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:
>>>> Can you quantify the difference?
>>> machine spec:
>>> CPU: IA64 x 8
>>> MEM: 8G (4G x2node)
>> 16k or 64k page size?
>
> 64k.
>
>
>>> So, about 2.5G - 3G difference in 8G mem.
>> Well not sure if that tells us much. Please show us the output of
>> /proc/meminfo after each run. The slab counters indicate how much memory is
>> used by the slabs.
>>
>> It would also be interesting to see the output of the slabinfo command after
>> the slub run?
>
> OK.
> But I can't do that this week,
> so I'll do it next week.
>
> Honestly, I don't know how to use the slabinfo command :-)

It's in linux/Documentation/vm/slabinfo.c

Do

gcc -o slabinfo Documentation/vm/slabinfo.c

./slabinfo

(./slabinfo -h if you are curious and want to use more advanced options)

2008-08-13 10:48:58

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> Well not sure if that tells us much. Please show us the output of
> /proc/meminfo after each run. The slab counters indicate how much memory is
> used by the slabs.
>
> It would also be interesting to see the output of the slabinfo command after
> the slub run?

Sorry for the late response.

SLAB uses 123M vs. SLUB's 1.5G.

Thoughts?


<slab>

% cat /proc/meminfo
MemTotal: 7701760 kB
MemFree: 5940096 kB
Buffers: 6400 kB
Cached: 27712 kB
SwapCached: 52544 kB
Active: 51520 kB
Inactive: 53248 kB
Active(anon): 26752 kB
Inactive(anon): 41792 kB
Active(file): 24768 kB
Inactive(file): 11456 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2031488 kB
SwapFree: 1958400 kB
Dirty: 192 kB
Writeback: 0 kB
AnonPages: 38400 kB
Mapped: 23232 kB
Slab: 123840 kB
SReclaimable: 30272 kB
SUnreclaim: 93568 kB
PageTables: 10688 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 5882368 kB
Committed_AS: 397568 kB
VmallocTotal: 17592177655808 kB
VmallocUsed: 29184 kB
VmallocChunk: 17592177626240 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 262144 kB

% cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
dm_mpath_io 0 0 40 1488 1 : tunables 120 60 8 : slabdata 0 0 0
dm_snap_tracked_chunk 0 0 24 2338 1 : tunables 120 60 8 : slabdata 0 0 0
dm_snap_pending_exception 0 0 112 564 1 : tunables 120 60 8 : slabdata 0 0 0
dm_snap_exception 0 0 32 1818 1 : tunables 120 60 8 : slabdata 0 0 0
kcopyd_job 0 0 408 158 1 : tunables 54 27 8 : slabdata 0 0 0
dm_target_io 515 2338 24 2338 1 : tunables 120 60 8 : slabdata 1 1 0
dm_io 515 1818 32 1818 1 : tunables 120 60 8 : slabdata 1 1 0
scsi_sense_cache 26 496 128 496 1 : tunables 120 60 8 : slabdata 1 1 0
scsi_cmd_cache 26 168 384 168 1 : tunables 54 27 8 : slabdata 1 1 0
uhci_urb_priv 0 0 56 1091 1 : tunables 120 60 8 : slabdata 0 0 0
flow_cache 0 0 96 654 1 : tunables 120 60 8 : slabdata 0 0 0
cfq_io_context 48 760 168 380 1 : tunables 120 60 8 : slabdata 2 2 0
cfq_queue 41 934 136 467 1 : tunables 120 60 8 : slabdata 2 2 0
mqueue_inode_cache 1 56 1152 56 1 : tunables 24 12 8 : slabdata 1 1 0
fat_inode_cache 1 77 840 77 1 : tunables 54 27 8 : slabdata 1 1 0
fat_cache 0 0 32 1818 1 : tunables 120 60 8 : slabdata 0 0 0
hugetlbfs_inode_cache 1 83 776 83 1 : tunables 54 27 8 : slabdata 1 1 0
ext2_inode_cache 0 0 1024 63 1 : tunables 54 27 8 : slabdata 0 0 0
ext2_xattr 0 0 88 711 1 : tunables 120 60 8 : slabdata 0 0 0
jbd2_journal_handle 0 0 24 2338 1 : tunables 120 60 8 : slabdata 0 0 0
jbd2_journal_head 0 0 96 654 1 : tunables 120 60 8 : slabdata 0 0 0
jbd2_revoke_table 0 0 16 3274 1 : tunables 120 60 8 : slabdata 0 0 0
jbd2_revoke_record 0 0 32 1818 1 : tunables 120 60 8 : slabdata 0 0 0
journal_handle 48 4676 24 2338 1 : tunables 120 60 8 : slabdata 2 2 0
journal_head 41 1308 96 654 1 : tunables 120 60 8 : slabdata 2 2 0
revoke_table 4 3274 16 3274 1 : tunables 120 60 8 : slabdata 1 1 0
revoke_record 0 0 32 1818 1 : tunables 120 60 8 : slabdata 0 0 0
ext4_inode_cache 0 0 1192 54 1 : tunables 24 12 8 : slabdata 0 0 0
ext4_xattr 0 0 88 711 1 : tunables 120 60 8 : slabdata 0 0 0
ext4_alloc_context 0 0 168 380 1 : tunables 120 60 8 : slabdata 0 0 0
ext4_prealloc_space 0 0 120 528 1 : tunables 120 60 8 : slabdata 0 0 0
ext3_inode_cache 367 5696 1016 64 1 : tunables 54 27 8 : slabdata 89 89 0
ext3_xattr 99 1422 88 711 1 : tunables 120 60 8 : slabdata 2 2 0
dnotify_cache 1 1488 40 1488 1 : tunables 120 60 8 : slabdata 1 1 0
kioctx 0 0 384 168 1 : tunables 54 27 8 : slabdata 0 0 0
kiocb 0 0 256 251 1 : tunables 120 60 8 : slabdata 0 0 0
inotify_event_cache 0 0 40 1488 1 : tunables 120 60 8 : slabdata 0 0 0
inotify_watch_cache 1 861 72 861 1 : tunables 120 60 8 : slabdata 1 1 0
fasync_cache 0 0 24 2338 1 : tunables 120 60 8 : slabdata 0 0 0
shmem_inode_cache 864 1105 1000 65 1 : tunables 54 27 8 : slabdata 17 17 0
pid_namespace 0 0 184 348 1 : tunables 120 60 8 : slabdata 0 0 0
nsproxy 0 0 56 1091 1 : tunables 120 60 8 : slabdata 0 0 0
posix_timers_cache 0 0 184 348 1 : tunables 120 60 8 : slabdata 0 0 0
uid_cache 6 502 256 251 1 : tunables 120 60 8 : slabdata 2 2 0
ia64_partial_page_cache 0 0 48 1259 1 : tunables 120 60 8 : slabdata 0 0 0
UNIX 32 126 1024 63 1 : tunables 54 27 8 : slabdata 2 2 0
UDP-Lite 0 0 1024 63 1 : tunables 54 27 8 : slabdata 0 0 0
tcp_bind_bucket 4 1924 64 962 1 : tunables 120 60 8 : slabdata 2 2 0
inet_peer_cache 0 0 64 962 1 : tunables 120 60 8 : slabdata 0 0 0
secpath_cache 0 0 64 962 1 : tunables 120 60 8 : slabdata 0 0 0
xfrm_dst_cache 0 0 384 168 1 : tunables 54 27 8 : slabdata 0 0 0
ip_fib_alias 3 1818 32 1818 1 : tunables 120 60 8 : slabdata 1 1 0
ip_fib_hash 15 1722 72 861 1 : tunables 120 60 8 : slabdata 2 2 0
ip_dst_cache 50 336 384 168 1 : tunables 54 27 8 : slabdata 2 2 0
arp_cache 1 251 256 251 1 : tunables 120 60 8 : slabdata 1 1 0
RAW 129 216 896 72 1 : tunables 54 27 8 : slabdata 3 3 0
UDP 9 126 1024 63 1 : tunables 54 27 8 : slabdata 2 2 0
tw_sock_TCP 0 0 256 251 1 : tunables 120 60 8 : slabdata 0 0 0
request_sock_TCP 0 0 128 496 1 : tunables 120 60 8 : slabdata 0 0 0
TCP 5 72 1792 36 1 : tunables 24 12 8 : slabdata 2 2 0
eventpoll_pwq 0 0 72 861 1 : tunables 120 60 8 : slabdata 0 0 0
eventpoll_epi 0 0 128 496 1 : tunables 120 60 8 : slabdata 0 0 0
sgpool-128 2 30 4096 15 1 : tunables 24 12 8 : slabdata 2 2 0
sgpool-64 2 62 2048 31 1 : tunables 24 12 8 : slabdata 2 2 0
sgpool-32 2 126 1024 63 1 : tunables 54 27 8 : slabdata 2 2 0
sgpool-16 2 252 512 126 1 : tunables 54 27 8 : slabdata 2 2 0
sgpool-8 18 502 256 251 1 : tunables 120 60 8 : slabdata 2 2 0
scsi_data_buffer 0 0 24 2338 1 : tunables 120 60 8 : slabdata 0 0 0
scsi_io_context 0 0 112 564 1 : tunables 120 60 8 : slabdata 0 0 0
blkdev_queue 26 70 1864 35 1 : tunables 24 12 8 : slabdata 2 2 0
blkdev_requests 44 212 304 212 1 : tunables 54 27 8 : slabdata 1 1 0
blkdev_ioc 38 1308 96 654 1 : tunables 120 60 8 : slabdata 2 2 0
biovec-256 34 60 4096 15 1 : tunables 24 12 8 : slabdata 4 4 0
biovec-128 34 93 2048 31 1 : tunables 24 12 8 : slabdata 3 3 0
biovec-64 34 126 1024 63 1 : tunables 54 27 8 : slabdata 2 2 0
biovec-16 34 502 256 251 1 : tunables 120 60 8 : slabdata 2 2 0
biovec-4 34 1924 64 962 1 : tunables 120 60 8 : slabdata 2 2 0
biovec-1 37 6548 16 3274 1 : tunables 120 60 8 : slabdata 2 2 0
bio 37 992 128 496 1 : tunables 120 60 8 : slabdata 2 2 0
sock_inode_cache 188 288 896 72 1 : tunables 54 27 8 : slabdata 4 4 0
skbuff_fclone_cache 16 126 512 126 1 : tunables 54 27 8 : slabdata 1 1 0
skbuff_head_cache 1812 11546 256 251 1 : tunables 120 60 8 : slabdata 46 46 0
file_lock_cache 4 668 192 334 1 : tunables 120 60 8 : slabdata 2 2 0
Acpi-Operand 24947 26691 72 861 1 : tunables 120 60 8 : slabdata 31 31 0
Acpi-ParseExt 0 0 72 861 1 : tunables 120 60 8 : slabdata 0 0 0
Acpi-Parse 0 0 48 1259 1 : tunables 120 60 8 : slabdata 0 0 0
Acpi-State 0 0 80 779 1 : tunables 120 60 8 : slabdata 0 0 0
Acpi-Namespace 18877 21816 32 1818 1 : tunables 120 60 8 : slabdata 12 12 0
page_cgroup 1183 142848 40 1488 1 : tunables 120 60 8 : slabdata 96 96 0
proc_inode_cache 197 902 792 82 1 : tunables 54 27 8 : slabdata 11 11 0
sigqueue 0 0 160 399 1 : tunables 120 60 8 : slabdata 0 0 0
radix_tree_node 719 7254 552 117 1 : tunables 54 27 8 : slabdata 62 62 0
bdev_cache 30 126 1024 63 1 : tunables 54 27 8 : slabdata 2 2 0
sysfs_dir_cache 11089 12464 80 779 1 : tunables 120 60 8 : slabdata 16 16 0
mnt_cache 24 502 256 251 1 : tunables 120 60 8 : slabdata 2 2 0
inode_cache 54 696 744 87 1 : tunables 54 27 8 : slabdata 8 8 0
dentry 1577 17794 224 287 1 : tunables 120 60 8 : slabdata 62 62 0
filp 706 3765 256 251 1 : tunables 120 60 8 : slabdata 15 15 0
names_cache 46 105 4096 15 1 : tunables 24 12 8 : slabdata 7 7 0
buffer_head 3557 125442 104 606 1 : tunables 120 60 8 : slabdata 207 207 0
mm_struct 76 288 896 72 1 : tunables 54 27 8 : slabdata 4 4 0
vm_area_struct 1340 2178 176 363 1 : tunables 120 60 8 : slabdata 6 6 36
fs_cache 61 992 128 496 1 : tunables 120 60 8 : slabdata 2 2 0
files_cache 62 336 768 84 1 : tunables 54 27 8 : slabdata 4 4 0
signal_cache 161 588 768 84 1 : tunables 54 27 8 : slabdata 7 7 0
sighand_cache 157 390 1664 39 1 : tunables 24 12 8 : slabdata 10 10 0
anon_vma 657 2976 40 1488 1 : tunables 120 60 8 : slabdata 2 2 0
pid 160 992 128 496 1 : tunables 120 60 8 : slabdata 2 2 0
shared_policy_node 0 0 48 1259 1 : tunables 120 60 8 : slabdata 0 0 0
numa_policy 7 244 264 244 1 : tunables 54 27 8 : slabdata 1 1 0
idr_layer_cache 150 476 544 119 1 : tunables 54 27 8 : slabdata 4 4 0
size-33554432(DMA) 0 0 33554432 1 512 : tunables 1 1 0 : slabdata 0 0 0
size-33554432 0 0 33554432 1 512 : tunables 1 1 0 : slabdata 0 0 0
size-16777216(DMA) 0 0 16777216 1 256 : tunables 1 1 0 : slabdata 0 0 0
size-16777216 0 0 16777216 1 256 : tunables 1 1 0 : slabdata 0 0 0
size-8388608(DMA) 0 0 8388608 1 128 : tunables 1 1 0 : slabdata 0 0 0
size-8388608 0 0 8388608 1 128 : tunables 1 1 0 : slabdata 0 0 0
size-4194304(DMA) 0 0 4194304 1 64 : tunables 1 1 0 : slabdata 0 0 0
size-4194304 0 0 4194304 1 64 : tunables 1 1 0 : slabdata 0 0 0
size-2097152(DMA) 0 0 2097152 1 32 : tunables 1 1 0 : slabdata 0 0 0
size-2097152 0 0 2097152 1 32 : tunables 1 1 0 : slabdata 0 0 0
size-1048576(DMA) 0 0 1048576 1 16 : tunables 1 1 0 : slabdata 0 0 0
size-1048576 0 0 1048576 1 16 : tunables 1 1 0 : slabdata 0 0 0
size-524288(DMA) 0 0 524288 1 8 : tunables 1 1 0 : slabdata 0 0 0
size-524288 0 0 524288 1 8 : tunables 1 1 0 : slabdata 0 0 0
size-262144(DMA) 0 0 262144 1 4 : tunables 1 1 0 : slabdata 0 0 0
size-262144 0 0 262144 1 4 : tunables 1 1 0 : slabdata 0 0 0
size-131072(DMA) 0 0 131072 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-131072 1 1 131072 1 2 : tunables 8 4 0 : slabdata 1 1 0
size-65536(DMA) 0 0 65536 1 1 : tunables 24 12 8 : slabdata 0 0 0
size-65536 4 4 65536 1 1 : tunables 24 12 8 : slabdata 4 4 0
size-32768(DMA) 0 0 32768 2 1 : tunables 24 12 8 : slabdata 0 0 0
size-32768 12 14 32768 2 1 : tunables 24 12 8 : slabdata 7 7 0
size-16384(DMA) 0 0 16384 4 1 : tunables 24 12 8 : slabdata 0 0 0
size-16384 15 28 16384 4 1 : tunables 24 12 8 : slabdata 7 7 0
size-8192(DMA) 0 0 8192 8 1 : tunables 24 12 8 : slabdata 0 0 0
size-8192 2455 2472 8192 8 1 : tunables 24 12 8 : slabdata 309 309 0
size-4096(DMA) 0 0 4096 15 1 : tunables 24 12 8 : slabdata 0 0 0
size-4096 1607 1665 4096 15 1 : tunables 24 12 8 : slabdata 111 111 0
size-2048(DMA) 0 0 2048 31 1 : tunables 24 12 8 : slabdata 0 0 0
size-2048 2706 2914 2048 31 1 : tunables 24 12 8 : slabdata 94 94 0
size-1024(DMA) 0 0 1024 63 1 : tunables 54 27 8 : slabdata 0 0 0
size-1024 2414 2583 1024 63 1 : tunables 54 27 8 : slabdata 41 41 0
size-512(DMA) 0 0 512 126 1 : tunables 54 27 8 : slabdata 0 0 0
size-512 1805 2142 512 126 1 : tunables 54 27 8 : slabdata 17 17 0
size-256(DMA) 0 0 256 251 1 : tunables 120 60 8 : slabdata 0 0 0
size-256 44889 48945 256 251 1 : tunables 120 60 8 : slabdata 195 195 0
size-128(DMA) 0 0 128 496 1 : tunables 120 60 8 : slabdata 0 0 0
size-64(DMA) 0 0 64 962 1 : tunables 120 60 8 : slabdata 0 0 0
size-128 28119 30256 128 496 1 : tunables 120 60 8 : slabdata 61 61 0
size-64 14597 22126 64 962 1 : tunables 120 60 8 : slabdata 23 23 0
kmem_cache 151 155 12416 5 1 : tunables 24 12 8 : slabdata 31 31 0


<SLUB>

% cat /proc/meminfo
MemTotal: 7701376 kB
MemFree: 4740928 kB
Buffers: 4544 kB
Cached: 35584 kB
SwapCached: 0 kB
Active: 119104 kB
Inactive: 9920 kB
Active(anon): 90240 kB
Inactive(anon): 0 kB
Active(file): 28864 kB
Inactive(file): 9920 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2031488 kB
SwapFree: 2031488 kB
Dirty: 64 kB
Writeback: 0 kB
AnonPages: 89152 kB
Mapped: 31232 kB
Slab: 1591680 kB
SReclaimable: 12608 kB
SUnreclaim: 1579072 kB
PageTables: 11904 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 5882176 kB
Committed_AS: 446848 kB
VmallocTotal: 17592177655808 kB
VmallocUsed: 29056 kB
VmallocChunk: 17592177626432 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 262144 kB

% cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kcopyd_job 0 0 408 160 1 : tunables 0 0 0 : slabdata 0 0 0
cfq_io_context 3120 3120 168 390 1 : tunables 0 0 0 : slabdata 8 8 0
cfq_queue 3848 3848 136 481 1 : tunables 0 0 0 : slabdata 8 8 0
mqueue_inode_cache 56 56 1152 56 1 : tunables 0 0 0 : slabdata 1 1 0
fat_inode_cache 77 77 848 77 1 : tunables 0 0 0 : slabdata 1 1 0
fat_cache 0 0 40 1638 1 : tunables 0 0 0 : slabdata 0 0 0
hugetlbfs_inode_cache 83 83 784 83 1 : tunables 0 0 0 : slabdata 1 1 0
ext2_inode_cache 0 0 1032 63 1 : tunables 0 0 0 : slabdata 0 0 0
journal_handle 21840 21840 24 2730 1 : tunables 0 0 0 : slabdata 8 8 0
journal_head 4774 4774 96 682 1 : tunables 0 0 0 : slabdata 7 7 0
revoke_table 4096 4096 16 4096 1 : tunables 0 0 0 : slabdata 1 1 0
revoke_record 2048 2048 32 2048 1 : tunables 0 0 0 : slabdata 1 1 0
ext4_inode_cache 0 0 1200 54 1 : tunables 0 0 0 : slabdata 0 0 0
ext4_alloc_context 0 0 168 390 1 : tunables 0 0 0 : slabdata 0 0 0
ext4_prealloc_space 0 0 120 546 1 : tunables 0 0 0 : slabdata 0 0 0
ext3_inode_cache 750 2624 1024 64 1 : tunables 0 0 0 : slabdata 41 41 0
ext3_xattr 4464 4464 88 744 1 : tunables 0 0 0 : slabdata 6 6 0
shmem_inode_cache 1256 1365 1008 65 1 : tunables 0 0 0 : slabdata 21 21 0
nsproxy 0 0 56 1170 1 : tunables 0 0 0 : slabdata 0 0 0
posix_timers_cache 0 0 184 356 1 : tunables 0 0 0 : slabdata 0 0 0
ip_dst_cache 1360 1360 384 170 1 : tunables 0 0 0 : slabdata 8 8 0
TCP 180 180 1792 36 1 : tunables 0 0 0 : slabdata 5 5 0
scsi_data_buffer 21840 21840 24 2730 1 : tunables 0 0 0 : slabdata 8 8 0
scsi_io_context 0 0 112 585 1 : tunables 0 0 0 : slabdata 0 0 0
blkdev_queue 140 140 1864 70 2 : tunables 0 0 0 : slabdata 2 2 0
blkdev_requests 1720 1720 304 215 1 : tunables 0 0 0 : slabdata 8 8 0
sock_inode_cache 758 949 896 73 1 : tunables 0 0 0 : slabdata 13 13 0
file_lock_cache 2289 2289 200 327 1 : tunables 0 0 0 : slabdata 7 7 0
Acpi-ParseExt 29117 29120 72 910 1 : tunables 0 0 0 : slabdata 32 32 0
page_cgroup 14660 24570 40 1638 1 : tunables 0 0 0 : slabdata 15 15 0
proc_inode_cache 732 810 800 81 1 : tunables 0 0 0 : slabdata 10 10 0
sigqueue 3272 3272 160 409 1 : tunables 0 0 0 : slabdata 8 8 0
radix_tree_node 1200 1755 560 117 1 : tunables 0 0 0 : slabdata 15 15 0
bdev_cache 256 256 1024 64 1 : tunables 0 0 0 : slabdata 4 4 0
sysfs_dir_cache 16376 16380 80 819 1 : tunables 0 0 0 : slabdata 20 20 0
inode_cache 707 957 752 87 1 : tunables 0 0 0 : slabdata 11 11 0
dentry 3503 11096 224 292 1 : tunables 0 0 0 : slabdata 38 38 0
buffer_head 6920 23985 112 585 1 : tunables 0 0 0 : slabdata 41 41 0
mm_struct 741 1022 896 73 1 : tunables 0 0 0 : slabdata 14 14 0
vm_area_struct 4015 5208 176 372 1 : tunables 0 0 0 : slabdata 14 14 0
signal_cache 801 1020 768 85 1 : tunables 0 0 0 : slabdata 12 12 0
sighand_cache 433 546 1664 39 1 : tunables 0 0 0 : slabdata 14 14 0
anon_vma 10920 10920 48 1365 1 : tunables 0 0 0 : slabdata 8 8 0
shared_policy_node 5460 5460 48 1365 1 : tunables 0 0 0 : slabdata 4 4 0
numa_policy 248 248 264 248 1 : tunables 0 0 0 : slabdata 1 1 0
idr_layer_cache 944 944 552 118 1 : tunables 0 0 0 : slabdata 8 8 0
kmalloc-65536 32 32 65536 4 4 : tunables 0 0 0 : slabdata 8 8 0
kmalloc-32768 128 128 32768 16 8 : tunables 0 0 0 : slabdata 8 8 0
kmalloc-16384 160 160 16384 32 8 : tunables 0 0 0 : slabdata 5 5 0
kmalloc-8192 448 448 8192 64 8 : tunables 0 0 0 : slabdata 7 7 0
kmalloc-4096 819 14336 4096 64 4 : tunables 0 0 0 : slabdata 224 224 0
kmalloc-2048 2409 8384 2048 64 2 : tunables 0 0 0 : slabdata 131 131 0
kmalloc-1024 1848 14912 1024 64 1 : tunables 0 0 0 : slabdata 233 233 0
kmalloc-512 2306 2432 512 128 1 : tunables 0 0 0 : slabdata 19 19 0
kmalloc-256 13919 123904 256 256 1 : tunables 0 0 0 : slabdata 484 484 0
kmalloc-128 28739 10747904 128 512 1 : tunables 0 0 0 : slabdata 20992 20992 0
kmalloc-64 10224 10240 64 1024 1 : tunables 0 0 0 : slabdata 10 10 0
kmalloc-32 34806 34816 32 2048 1 : tunables 0 0 0 : slabdata 17 17 0
kmalloc-16 32768 32768 16 4096 1 : tunables 0 0 0 : slabdata 8 8 0
kmalloc-8 65536 65536 8 8192 1 : tunables 0 0 0 : slabdata 8 8 0
kmalloc-192 4609 447051 192 341 1 : tunables 0 0 0 : slabdata 1311 1311 0
kmalloc-96 5456 5456 96 682 1 : tunables 0 0 0 : slabdata 8 8 0
kmem_cache_node 3276 3276 80 819 1 : tunables 0 0 0 : slabdata 4 4 0

% slabinfo
Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
:at-0000016 4096 16 65.5K 0/0/1 4096 0 0 100 *a
:at-0000024 21840 24 524.2K 0/0/8 2730 0 0 99 *a
:at-0000032 2048 32 65.5K 0/0/1 2048 0 0 100 *Aa
:at-0000088 4464 88 393.2K 0/0/6 744 0 0 99 *a
:at-0000096 4774 96 458.7K 0/0/7 682 0 0 99 *a
:t-0000016 32768 16 524.2K 0/0/8 4096 0 0 100 *
:t-0000024 21840 24 524.2K 0/0/8 2730 0 0 99 *
:t-0000032 34806 32 1.1M 9/1/8 2048 0 5 99 *
:t-0000040 14660 40 983.0K 7/7/8 1638 0 46 59 *
:t-0000048 5460 48 262.1K 0/0/4 1365 0 0 99 *
:t-0000064 10224 64 655.3K 2/1/8 1024 0 10 99 *
:t-0000072 29117 72 2.0M 26/2/6 910 0 6 99 *
:t-0000080 16376 80 1.3M 12/1/8 819 0 5 99 *
:t-0000096 5456 96 524.2K 0/0/8 682 0 0 99 *
:t-0000128 28739 128 1.3G 20984/20984/8 512 0 99 0 *
:t-0000256 15285 256 31.7M 476/438/8 256 0 90 12 *
:t-0000384 1360 352 524.2K 0/0/8 170 0 0 91 *A
:t-0000512 2306 512 1.2M 11/3/8 128 0 15 94 *
:t-0000768 801 768 786.4K 4/4/8 85 0 33 78 *A
:t-0000896 741 880 917.5K 6/5/8 73 0 35 71 *A
:t-0001024 1848 1024 15.2M 225/214/8 64 0 91 12 *
:t-0002048 2406 2048 17.1M 123/115/8 64 1 87 28 *
:t-0004096 819 4096 58.7M 216/216/8 64 2 96 5 *
anon_vma 10920 40 524.2K 0/0/8 1365 0 0 83
bdev_cache 256 1008 262.1K 0/0/4 64 0 0 98 Aa
blkdev_queue 140 1864 262.1K 0/0/2 70 1 0 99
blkdev_requests 1720 304 524.2K 0/0/8 215 0 0 99
buffer_head 7493 104 2.6M 33/32/8 585 0 78 29 a
cfq_io_context 3120 168 524.2K 0/0/8 390 0 0 99
cfq_queue 3848 136 524.2K 0/0/8 481 0 0 99
dentry 3793 224 2.4M 30/29/8 292 0 76 34 a
ext3_inode_cache 750 1016 2.6M 33/33/8 64 0 80 28 a
fat_inode_cache 77 840 65.5K 0/0/1 77 0 0 98 a
file_lock_cache 2289 192 458.7K 0/0/7 327 0 0 95
hugetlbfs_inode_cache 83 776 65.5K 0/0/1 83 0 0 98
idr_layer_cache 944 544 524.2K 0/0/8 118 0 0 97
inode_cache 1044 744 786.4K 4/0/8 87 0 0 98 a
kmalloc-16384 160 16384 2.6M 0/0/5 32 3 0 100
kmalloc-192 4609 192 85.9M 1303/1303/8 341 0 99 1
kmalloc-32768 128 32768 4.1M 0/0/8 16 3 0 100
kmalloc-65536 32 65536 2.0M 0/0/8 4 2 0 100
kmalloc-8 65536 8 524.2K 0/0/8 8192 0 0 100
kmalloc-8192 448 8192 3.6M 0/0/7 64 3 0 100
kmem_cache_node 3276 80 262.1K 0/0/4 819 0 0 99 *
mqueue_inode_cache 56 1064 65.5K 0/0/1 56 0 0 90 A
numa_policy 248 264 65.5K 0/0/1 248 0 0 99
proc_inode_cache 732 792 655.3K 2/1/8 81 0 10 88 a
radix_tree_node 1200 552 983.0K 7/7/8 117 0 46 67 a
shmem_inode_cache 1256 1000 1.3M 13/4/8 65 0 19 91
sighand_cache 433 1608 917.5K 6/4/8 39 0 28 75 A
sigqueue 3272 160 524.2K 0/0/8 409 0 0 99
sock_inode_cache 758 832 851.9K 5/4/8 73 0 30 74 Aa
TCP 180 1712 327.6K 0/0/5 36 0 0 94 A
vm_area_struct 4015 176 917.5K 6/6/8 372 0 42 77






2008-08-13 13:11:17

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:


> <SLUB>
>
> % cat /proc/meminfo
>
> Slab: 1591680 kB
> SReclaimable: 12608 kB
> SUnreclaim: 1579072 kB

Unreclaimable grew very big.


> :t-0000128 28739 128 1.3G 20984/20984/8 512 0 99 0 *

Argh. Most slabs contain a single object. Probably due to the conflict resolution.


> kmalloc-192 4609 192 85.9M 1303/1303/8 341 0 99 1

And a similar but not so severe issue here.

The obvious fix is to avoid allocating another slab on conflict but how will
this impact performance?


Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2008-08-13 08:06:00.000000000 -0500
+++ linux-2.6/mm/slub.c 2008-08-13 08:07:59.000000000 -0500
@@ -1253,13 +1253,11 @@
static inline int lock_and_freeze_slab(struct kmem_cache_node *n,
struct page *page)
{
- if (slab_trylock(page)) {
- list_del(&page->lru);
- n->nr_partial--;
- __SetPageSlubFrozen(page);
- return 1;
- }
- return 0;
+ slab_lock(page);
+ list_del(&page->lru);
+ n->nr_partial--;
+ __SetPageSlubFrozen(page);
+ return 1;
}

2008-08-13 14:15:00

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

>> :t-0000128 28739 128 1.3G 20984/20984/8 512 0 99 0 *
>
> Argh. Most slabs contain a single object. Probably due to the conflict resolution.

agreed with the issue exist in lock contention code.


> The obvious fix is to avoid allocating another slab on conflict but how will
> this impact performance?
>
>
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c 2008-08-13 08:06:00.000000000 -0500
> +++ linux-2.6/mm/slub.c 2008-08-13 08:07:59.000000000 -0500
> @@ -1253,13 +1253,11 @@
> static inline int lock_and_freeze_slab(struct kmem_cache_node *n,
> struct page *page)
> {
> - if (slab_trylock(page)) {
> - list_del(&page->lru);
> - n->nr_partial--;
> - __SetPageSlubFrozen(page);
> - return 1;
> - }
> - return 0;
> + slab_lock(page);
> + list_del(&page->lru);
> + n->nr_partial--;
> + __SetPageSlubFrozen(page);
> + return 1;
> }

I haven't measured it yet. I don't like this patch;
maybe it decreases other typical benchmarks.

So, I think a better way is:

1. slab_trylock(); if it succeeds, goto 10.
2. check the fragmentation ratio; if it is low, goto 10.
3. slab_lock()
10. return from the function

I think this way doesn't cause a performance regression,
because high fragmentation causes defrag and compaction later.
So, preventing fragmentation often increases performance.

Thoughts?
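
Spelled out as code, the proposed flow would look roughly like the sketch
below; the fragmentation test is an invented placeholder, and (as comes up
later in this thread) the blocking slab_lock() path has a list_lock/slab_lock
ordering problem:

static inline int lock_and_freeze_slab(struct kmem_cache_node *n,
                                        struct page *page)
{
        if (!slab_trylock(page)) {
                /* Contended: only insist on this slab if it is nearly empty. */
                if (!slab_is_sparsely_populated(page))  /* hypothetical test */
                        return 0;
                slab_lock(page);        /* blocking; see the inversion discussed below */
        }
        list_del(&page->lru);
        n->nr_partial--;
        __SetPageSlubFrozen(page);
        return 1;
}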

2008-08-13 14:17:44

by Pekka Enberg

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

On Wed, 2008-08-13 at 23:14 +0900, KOSAKI Motohiro wrote:
> >> :t-0000128 28739 128 1.3G 20984/20984/8 512 0 99 0 *
> >
> > Argh. Most slabs contain a single object. Probably due to the conflict resolution.
>
> agreed with the issue exist in lock contention code.
>
>
> > The obvious fix is to avoid allocating another slab on conflict but how will
> > this impact performance?
> >
> >
> > Index: linux-2.6/mm/slub.c
> > ===================================================================
> > --- linux-2.6.orig/mm/slub.c 2008-08-13 08:06:00.000000000 -0500
> > +++ linux-2.6/mm/slub.c 2008-08-13 08:07:59.000000000 -0500
> > @@ -1253,13 +1253,11 @@
> > static inline int lock_and_freeze_slab(struct kmem_cache_node *n,
> > struct page *page)
> > {
> > - if (slab_trylock(page)) {
> > - list_del(&page->lru);
> > - n->nr_partial--;
> > - __SetPageSlubFrozen(page);
> > - return 1;
> > - }
> > - return 0;
> > + slab_lock(page);
> > + list_del(&page->lru);
> > + n->nr_partial--;
> > + __SetPageSlubFrozen(page);
> > + return 1;
> > }
>
> I haven't measured it yet. I don't like this patch;
> maybe it decreases other typical benchmarks.
>
> So, I think a better way is:
>
> 1. slab_trylock(); if it succeeds, goto 10.
> 2. check the fragmentation ratio; if it is low, goto 10.
> 3. slab_lock()
> 10. return from the function
>
> I think this way doesn't cause a performance regression,
> because high fragmentation causes defrag and compaction later.
> So, preventing fragmentation often increases performance.
>
> Thoughts?

I guess that would work. But how exactly would you quantify
"fragmentation ratio?"

2008-08-13 14:32:19

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:
>
> I haven't measured it yet. I don't like this patch;
> maybe it decreases other typical benchmarks.

Yes but running with this patch would allow us to verify that we understand
what is causing the problem. There are other solutions like skipping to the
next partial slab on the list that could fix performance issues that the patch
may cause. A test will give us:

1. Confirmation that the memory use is caused by the trylock.

2. Some performance numbers. If these show a regression then we have some
markers that we can measure other solutions against.

2008-08-13 15:06:14

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> Yes but running with this patch would allow us to verify that we understand
> what is causing the problem. There are other solutions like skipping to the
> next partial slab on the list that could fix performance issues that the patch
> may cause. A test will give us:
>
> 1. Confirmation that the memory use is caused by the trylock.
>
> 2. Some performance numbers. If these show a regression then we have some
> markers that we can measure other solutions against.

OK.
I will test the patch next week.

(Unfortunately, my company is closed for the rest of this week.)

Thanks.

2008-08-14 07:19:17

by Pekka Enberg

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Hi Christoph,

Christoph Lameter wrote:
> The obvious fix is to avoid allocating another slab on conflict but how will
> this impact performance?
>
>
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c 2008-08-13 08:06:00.000000000 -0500
> +++ linux-2.6/mm/slub.c 2008-08-13 08:07:59.000000000 -0500
> @@ -1253,13 +1253,11 @@
> static inline int lock_and_freeze_slab(struct kmem_cache_node *n,
> struct page *page)
> {
> - if (slab_trylock(page)) {
> - list_del(&page->lru);
> - n->nr_partial--;
> - __SetPageSlubFrozen(page);
> - return 1;
> - }
> - return 0;
> + slab_lock(page);
> + list_del(&page->lru);
> + n->nr_partial--;
> + __SetPageSlubFrozen(page);
> + return 1;
> }

This patch hard locks on my 2-way 64-bit x86 machine (sysrq doesn't
respond) when I run hackbench.

2008-08-14 14:46:51

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Pekka Enberg wrote:
> This patch hard locks on my 2-way 64-bit x86 machine (sysrq doesn't
> respond) when I run hackbench.
Hmmm.. Then the issue may be different from what we thought. The lock may be
taken recursively in some situations.
Can you enable lockdep?

2008-08-14 15:08:03

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Pekka Enberg wrote:
>
> This patch hard locks on my 2-way 64-bit x86 machine (sysrq doesn't
> respond) when I run hackbench.
At that point we take the list_lock and then the slab lock, which is a
lock inversion if we do not use a trylock here. Crap.

Hmmm.. The code already goes to the next slab if an earlier one is
already locked. So I do not see how the large partial lists could be
generated.
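
For reference, the inversion described here is the classic ABBA pattern;
roughly, simplified from the 2.6.27-era code paths (an illustration, not a
patch):

/*
 * Allocation path (get_partial_node and friends):
 *      spin_lock(&n->list_lock);       lock A
 *      slab_lock(page);                lock B  <- blocking here can deadlock
 *
 * Free path (__slab_free putting a page back on the partial list):
 *      slab_lock(page);                lock B
 *      spin_lock(&n->list_lock);       lock A
 *
 * With a trylock in the allocation path the cycle cannot close; with a
 * blocking slab_lock() two CPUs can each hold one lock while waiting
 * for the other.
 */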

2008-08-14 19:46:45

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

This is a NUMA system right? Then we have another mechanism that will avoid
off node memory references by allocating new slabs. Can you set the
node_defrag parameter to 0? (Noted by Adrian).

2008-08-15 16:44:41

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> This is a NUMA system right?

True.
My system is

CPU: ia64 x8
MEM: 8G (4G x 2node)

> Then we have another mechanism that will avoid
> off node memory references by allocating new slabs. Can you set the
> node_defrag parameter to 0? (Noted by Adrian).

Please let me know how to do that operation.

2008-08-15 18:25:54

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:

>> Then we have another mechanism that will avoid
>> off node memory references by allocating new slabs. Can you set the
>> node_defrag parameter to 0? (Noted by Adrian).
>
> Please let me know how to do that operation.

The control over the preference of node-local vs. remote defrag occurs
via /sys/kernel/slab/<slabcache>/remote_node_defrag_ratio. The default is 10%.
Comments in get_any_partial explain the operation.

The default setting means that in 9 out of 10 cases slub will prefer creating
a new slab over taking one from the remote node (meaning the memory is node
local, probably not important in your 2 node case). It will therefore waste
memory because local memory may be more efficient to use.

Setting remote_node_defrag_ratio to 100 will make slub always take the remote
slab instead of allocating a new one.


/*
* The defrag ratio allows a configuration of the tradeoffs between
* inter node defragmentation and node local allocations. A lower
* defrag_ratio increases the tendency to do local allocations
* instead of attempting to obtain partial slabs from other nodes.
*
* If the defrag_ratio is set to 0 then kmalloc() always
* returns node local objects. If the ratio is higher then kmalloc()
* may return off node objects because partial slabs are obtained
* from other nodes and filled up.
*
* If /sys/kernel/slab/xx/defrag_ratio is set to 100 (which makes
* defrag_ratio = 1000) then every (well almost) allocation will
* first attempt to defrag slab caches on other nodes. This means
* scanning over all nodes to look for partial slabs which may be
* expensive if we do it every time we are trying to find a slab
* with available objects.
*/
if (!s->remote_node_defrag_ratio ||
get_cycles() % 1024 > s->remote_node_defrag_ratio)
return NULL;



2008-08-15 19:43:26

by Christoph Lameter

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

Christoph Lameter wrote:

> Setting remote_node_defrag_ratio to 100 will make slub always take the remote
> slab instead of allocating a new one.

As pointed out by Adrian D. off list:

The max remote_node_defrag_ratio is 99.

Maybe we need to change the comparison in remote_node_defrag_ratio_store() to
allow 100 to switch off any node local allocs?

2008-08-18 10:10:25

by KOSAKI Motohiro

Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> Christoph Lameter wrote:
>
> > Setting remote_node_defrag_ratio to 100 will make slub always take the remote
> > slab instead of allocating a new one.
>
> As pointed out by Adrian D. off list:
>
> The max remote_node_defrag_ratio is 99.
>
> Maybe we need to change the comparison in remote_node_defrag_ratio_store() to
> allow 100 to switch off any node local allocs?

Hmmm,
it doesn't change the behavior at all.

I did the following:

1. slub code change (see below)


Index: b/mm/slub.c
===================================================================
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4056,7 +4056,7 @@ static ssize_t remote_node_defrag_ratio_
if (err)
return err;

- if (ratio < 100)
+ if (ratio <= 100)
s->remote_node_defrag_ratio = ratio * 10;

return length;


2. change remote defrag ratio
# echo 100 > /sys/kernel/slab/:t-0000128/remote_node_defrag_ratio
# cat /sys/kernel/slab/:t-0000128/remote_node_defrag_ratio
100

3. ran hackbench
4. ./slabinfo

Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
:at-0000016 4096 16 65.5K 0/0/1 4096 0 0 100 *a
:at-0000024 21840 24 524.2K 0/0/8 2730 0 0 99 *a
:at-0000032 2048 32 65.5K 0/0/1 2048 0 0 100 *Aa
:at-0000088 4464 88 393.2K 0/0/6 744 0 0 99 *a
:at-0000096 5456 96 524.2K 0/0/8 682 0 0 99 *a
:t-0000016 32768 16 524.2K 0/0/8 4096 0 0 100 *
:t-0000024 21840 24 524.2K 0/0/8 2730 0 0 99 *
:t-0000032 34806 32 1.1M 9/1/8 2048 0 5 99 *
:t-0000040 14417 40 917.5K 6/6/8 1638 0 42 62 *
:t-0000048 5460 48 262.1K 0/0/4 1365 0 0 99 *
:t-0000064 10224 64 655.3K 2/1/8 1024 0 10 99 *
:t-0000072 29120 72 2.0M 26/0/6 910 0 0 99 *
:t-0000080 16376 80 1.3M 12/1/8 819 0 5 99 *
:t-0000096 5456 96 524.2K 0/0/8 682 0 0 99 *
:t-0000128 28917 128 1.3G 21041/21041/8 512 0 99 0 *
:t-0000256 15280 256 31.4M 472/436/8 256 0 90 12 *
:t-0000384 1360 352 524.2K 0/0/8 170 0 0 91 *A
:t-0000512 2388 512 1.3M 12/4/8 128 0 20 93 *
:t-0000768 851 768 851.9K 5/5/8 85 0 38 76 *A
:t-0000896 742 880 851.9K 5/4/8 73 0 30 76 *A
:t-0001024 1819 1024 15.1M 223/211/8 64 0 91 12 *
:t-0002048 2641 2048 17.9M 129/116/8 64 1 84 30 *
:t-0004096 817 4096 57.1M 210/210/8 64 2 96 5 *
anon_vma 10920 40 524.2K 0/0/8 1365 0 0 83
bdev_cache 256 1008 262.1K 0/0/4 64 0 0 98 Aa
blkdev_queue 140 1864 262.1K 0/0/2 70 1 0 99
blkdev_requests 1720 304 524.2K 0/0/8 215 0 0 99
buffer_head 7284 104 2.5M 31/30/8 585 0 76 29 a
cfq_io_context 3120 168 524.2K 0/0/8 390 0 0 99
cfq_queue 3848 136 524.2K 0/0/8 481 0 0 99
dentry 3775 224 2.5M 31/29/8 292 0 74 33 a
ext3_inode_cache 740 1016 2.4M 30/30/8 64 0 78 30 a
fat_inode_cache 77 840 65.5K 0/0/1 77 0 0 98 a
file_lock_cache 2616 192 524.2K 0/0/8 327 0 0 95
hugetlbfs_inode_cache 83 776 65.5K 0/0/1 83 0 0 98
idr_layer_cache 944 544 524.2K 0/0/8 118 0 0 97
inode_cache 1050 744 851.9K 5/1/8 87 0 7 91 a
kmalloc-16384 160 16384 2.6M 0/0/5 32 3 0 100
kmalloc-192 4578 192 87.5M 1328/1328/8 341 0 99 1
kmalloc-32768 128 32768 4.1M 0/0/8 16 3 0 100
kmalloc-65536 32 65536 2.0M 0/0/8 4 2 0 100
kmalloc-8 65536 8 524.2K 0/0/8 8192 0 0 100
kmalloc-8192 512 8192 4.1M 0/0/8 64 3 0 100
kmem_cache_node 3276 80 262.1K 0/0/4 819 0 0 99 *
mqueue_inode_cache 56 1064 65.5K 0/0/1 56 0 0 90 A
numa_policy 248 264 65.5K 0/0/1 248 0 0 99
proc_inode_cache 655 792 720.8K 3/3/8 81 0 27 71 a
radix_tree_node 1142 552 917.5K 6/6/8 117 0 42 68 a
shmem_inode_cache 1230 1000 1.3M 12/3/8 65 0 15 93
sighand_cache 434 1608 917.5K 6/4/8 39 0 28 76 A
sigqueue 3272 160 524.2K 0/0/8 409 0 0 99
sock_inode_cache 774 832 851.9K 5/3/8 73 0 23 75 Aa
TCP 144 1712 262.1K 0/0/4 36 0 0 94 A
vm_area_struct 4034 176 851.9K 5/5/8 372 0 38 83




2008-08-18 10:37:04

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> > Christoph Lameter wrote:
> >
> > > Setting remote_node_defrag_ratio to 100 will make slub always take the remote
> > > slab instead of allocating a new one.
> >
> > As pointed out by Adrian D. off list:
> >
> > The max remote_node_defrag_ratio is 99.
> >
> > Maybe we need to change the comparison in remote_node_defrag_ratio_store() to
> > allow 100 to switch off any node local allocs?
>
> Hmmm,
> it doesn't change the behavior at all.

Ah, ok.
I made a mistake.

The new patch is here.

Index: b/mm/slub.c
===================================================================
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1326,9 +1326,11 @@ static struct page *get_any_partial(stru
* expensive if we do it every time we are trying to find a slab
* with available objects.
*/
+#if 0
if (!s->remote_node_defrag_ratio ||
get_cycles() % 1024 > s->remote_node_defrag_ratio)
return NULL;
+#endif

zonelist = node_zonelist(slab_node(current->mempolicy), flags);
for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {


The new result is here.

% cat /proc/meminfo
MemTotal: 7701504 kB
MemFree: 5986432 kB
Buffers: 7872 kB
Cached: 38208 kB
SwapCached: 0 kB
Active: 120256 kB
Inactive: 14656 kB
Active(anon): 90304 kB
Inactive(anon): 0 kB
Active(file): 29952 kB
Inactive(file): 14656 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 2031488 kB
SwapFree: 2031488 kB
Dirty: 448 kB
Writeback: 0 kB
AnonPages: 89088 kB
Mapped: 31360 kB
Slab: 69952 kB
SReclaimable: 13376 kB
SUnreclaim: 56576 kB
PageTables: 11648 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 5882240 kB
Committed_AS: 453440 kB
VmallocTotal: 17592177655808 kB
VmallocUsed: 29312 kB
VmallocChunk: 17592177626112 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 262144 kB


% slabinfo
Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
:at-0000016 4096 16 65.5K 0/0/1 4096 0 0 100 *a
:at-0000024 21840 24 524.2K 0/0/8 2730 0 0 99 *a
:at-0000032 2048 32 65.5K 0/0/1 2048 0 0 100 *Aa
:at-0000088 2976 88 262.1K 0/0/4 744 0 0 99 *a
:at-0000096 4774 96 458.7K 0/0/7 682 0 0 99 *a
:t-0000016 32768 16 524.2K 0/0/8 4096 0 0 100 *
:t-0000024 21840 24 524.2K 0/0/8 2730 0 0 99 *
:t-0000032 34806 32 1.1M 9/1/8 2048 0 5 99 *
:t-0000040 14279 40 851.9K 5/5/8 1638 0 38 67 *
:t-0000048 5460 48 262.1K 0/0/4 1365 0 0 99 *
:t-0000064 10224 64 655.3K 2/1/8 1024 0 10 99 *
:t-0000072 29109 72 2.0M 26/4/6 910 0 12 99 *
:t-0000080 16379 80 1.3M 12/1/8 819 0 5 99 *
:t-0000096 5456 96 524.2K 0/0/8 682 0 0 99 *
:t-0000128 27831 128 3.6M 48/8/8 512 0 14 97 *
:t-0000256 15401 256 9.8M 143/96/8 256 0 63 39 *
:t-0000384 1360 352 524.2K 0/0/8 170 0 0 91 *A
:t-0000512 2307 512 1.2M 11/3/8 128 0 15 94 *
:t-0000768 755 768 720.8K 3/3/8 85 0 27 80 *A
:t-0000896 728 880 851.9K 5/4/8 73 0 30 75 *A
:t-0001024 1810 1024 1.9M 21/4/8 64 0 13 97 *
:t-0002048 2621 2048 5.5M 34/15/8 64 1 35 97 *
:t-0004096 775 4096 3.4M 5/2/8 64 2 15 93 *
anon_vma 10920 40 524.2K 0/0/8 1365 0 0 83
bdev_cache 192 1008 196.6K 0/0/3 64 0 0 98 Aa
blkdev_queue 140 1864 262.1K 0/0/2 70 1 0 99
blkdev_requests 1720 304 524.2K 0/0/8 215 0 0 99
buffer_head 8020 104 2.7M 34/32/8 585 0 76 30 a
cfq_io_context 3120 168 524.2K 0/0/8 390 0 0 99
cfq_queue 3848 136 524.2K 0/0/8 481 0 0 99
dentry 3798 224 2.5M 31/30/8 292 0 76 33 a
ext3_inode_cache 1127 1016 2.7M 34/34/8 64 0 80 41 a
fat_inode_cache 77 840 65.5K 0/0/1 77 0 0 98 a
file_lock_cache 2289 192 458.7K 0/0/7 327 0 0 95
hugetlbfs_inode_cache 83 776 65.5K 0/0/1 83 0 0 98
idr_layer_cache 944 544 524.2K 0/0/8 118 0 0 97
inode_cache 1044 744 786.4K 4/0/8 87 0 0 98 a
kmalloc-16384 160 16384 2.6M 0/0/5 32 3 0 100
kmalloc-192 3883 192 1.0M 8/8/8 341 0 50 71
kmalloc-32768 128 32768 4.1M 0/0/8 16 3 0 100
kmalloc-65536 32 65536 2.0M 0/0/8 4 2 0 100
kmalloc-8 65536 8 524.2K 0/0/8 8192 0 0 100
kmalloc-8192 512 8192 4.1M 0/0/8 64 3 0 100
kmem_cache_node 3276 80 262.1K 0/0/4 819 0 0 99 *
mqueue_inode_cache 56 1064 65.5K 0/0/1 56 0 0 90 A
numa_policy 248 264 65.5K 0/0/1 248 0 0 99
proc_inode_cache 653 792 655.3K 2/2/8 81 0 20 78 a
radix_tree_node 1221 552 983.0K 7/7/8 117 0 46 68 a
shmem_inode_cache 1218 1000 1.3M 12/3/8 65 0 15 92
sighand_cache 416 1608 851.9K 5/3/8 39 0 23 78 A
sigqueue 3272 160 524.2K 0/0/8 409 0 0 99
sock_inode_cache 758 832 786.4K 4/3/8 73 0 25 80 Aa
TCP 180 1712 327.6K 0/0/5 36 0 0 94 A
vm_area_struct 4054 176 851.9K 5/5/8 372 0 38 83


2008-08-18 14:10:17

by Christoph Lameter

[permalink] [raw]
Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:

> The new patch is here.
>
> Index: b/mm/slub.c
> ===================================================================
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1326,9 +1326,11 @@ static struct page *get_any_partial(stru
> * expensive if we do it every time we are trying to find a slab
> * with available objects.
> */
> +#if 0
> if (!s->remote_node_defrag_ratio ||
> get_cycles() % 1024 > s->remote_node_defrag_ratio)
> return NULL;
> +#endif
>
> zonelist = node_zonelist(slab_node(current->mempolicy), flags);
> for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {

Hmmm.... So always taking from the partial lists works? That is the same
effect that setting remote_node_defrag_ratio to 100 should have had (it's
multiplied by 10 when storing it).

So it's a NUMA-only phenomenon. How is performance affected?

2008-08-19 10:35:36

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> > +#if 0
> > if (!s->remote_node_defrag_ratio ||
> > get_cycles() % 1024 > s->remote_node_defrag_ratio)
> > return NULL;
> > +#endif
> >
> > zonelist = node_zonelist(slab_node(current->mempolicy), flags);
> > for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
>
> Hmmm.... So always taking from the partial lists works? That is the same
> effect that setting remote_node_defrag_ratio to 100 should have had (it's
> multiplied by 10 when storing it).

Sorry, I don't know the reason yet.
OK, I'll dig into it more.

> So it's a NUMA-only phenomenon. How is performance affected?

Unfortunately, I can't measure it, because:

- The Fujitsu server can access remote nodes faster than a typical NUMA server,
  so my performance numbers often aren't typical.
- My box (4G x 2 nodes) is very small for a NUMA machine,
  but this mechanism is aimed at improving large servers.

IOW, my box didn't show a performance regression,
but I don't think that is typical.


2008-08-19 13:53:09

by Christoph Lameter

[permalink] [raw]
Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

KOSAKI Motohiro wrote:

> IOW, my box didn't show a performance regression,
> but I don't think that is typical.

Well, that is typical for a small NUMA system. Maybe this patch will fix it for
now? Large systems can be tuned by setting the ratio lower.


Subject: slub/NUMA: Disable remote node defragmentation by default

Switch remote node defragmentation off by default. The current settings can
cause excessive node local allocations with hackbench. (Note that this feature
is not related to slab defragmentation).

Signed-off-by: Christoph Lameter <[email protected]>

---
mm/slub.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2008-08-19 06:45:54.732348449 -0700
+++ linux-2.6/mm/slub.c 2008-08-19 06:46:12.442348249 -0700
@@ -2312,7 +2312,7 @@ static int kmem_cache_open(struct kmem_c

s->refcount = 1;
#ifdef CONFIG_NUMA
- s->remote_node_defrag_ratio = 100;
+ s->remote_node_defrag_ratio = 1000;
#endif
if (!init_kmem_cache_nodes(s, gfpflags & ~SLUB_DMA))
goto error;
@@ -4058,7 +4058,7 @@ static ssize_t remote_node_defrag_ratio_
if (err)
return err;

- if (ratio < 100)
+ if (ratio <= 100)
s->remote_node_defrag_ratio = ratio * 10;

return length;

2008-08-20 11:48:14

by KOSAKI Motohiro

[permalink] [raw]
Subject: Re: No, really, stop trying to delete slab until you've finished making slub perform as well

> KOSAKI Motohiro wrote:
>
> > IOW, my box didn't show a performance regression,
> > but I don't think that is typical.
>
> Well that is typical for small NUMA system. Maybe this patch will fix it for
> now? Large systems can be tuned by setting the ratio lower.
>
>
> Subject: slub/NUMA: Disable remote node defragmentation by default
>
> Switch remote node defragmentation off by default. The current settings can
> cause excessive node local allocations with hackbench. (Note that this feature
> is not related to slab defragmentation).

OK.
I confirmed this patch works well.

Tested-by: KOSAKI Motohiro <[email protected]>


>
> Signed-off-by: Christoph Lameter <[email protected]>
>
> ---
> mm/slub.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c 2008-08-19 06:45:54.732348449 -0700
> +++ linux-2.6/mm/slub.c 2008-08-19 06:46:12.442348249 -0700
> @@ -2312,7 +2312,7 @@ static int kmem_cache_open(struct kmem_c
>
> s->refcount = 1;
> #ifdef CONFIG_NUMA
> - s->remote_node_defrag_ratio = 100;
> + s->remote_node_defrag_ratio = 1000;
> #endif
> if (!init_kmem_cache_nodes(s, gfpflags & ~SLUB_DMA))
> goto error;
> @@ -4058,7 +4058,7 @@ static ssize_t remote_node_defrag_ratio_
> if (err)
> return err;
>
> - if (ratio < 100)
> + if (ratio <= 100)
> s->remote_node_defrag_ratio = ratio * 10;
>
> return length;