2007-11-07 01:12:43

by Christoph Lameter

Subject: [patch 00/23] Slab defragmentation V6

Slab defragmentation is mainly an issue if Linux is used as a fileserver
and large amounts of dentries, inodes and buffer heads accumulate. In some
load situations the slabs become very sparsely populated so that a lot of
memory is wasted by slabs that only contain one or a few objects. In
extreme cases the performance of a machine will become sluggish since
we are continually running reclaim. Slab defragmentation adds the
capability to recover wasted memory.

With lumpy reclaim slab defragmentation can be used to enhance the
ability to recover larger contiguous areas of memory. Lumpy reclaim currently
cannot do anything if a slab page is encountered. With slab defragmentation
that slab page can be removed and a large contiguous page freed. It may
be possible to have slab pages also part of ZONE_MOVABLE (Mel's defrag
scheme in 2.6.23) or the MOVABLE areas (antifrag patches in mm).

The patchset is also available via git:

git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/slab.git defrag


Currently memory reclaim from the following slab caches is possible:

1. dentry cache
2. inode cache (with a generic interface to allow easy setup of more
filesystems than the currently supported ext2/3/4, reiserfs, XFS
and proc)
3. buffer_heads

One typical mechanism that triggers slab defragmentation on my systems
is the daily run of

updatedb

Updatedb scans all files on the system, which causes high inode and dentry
use. After updatedb is complete we need to go back to the regular use
patterns (typical on my machine: kernel compiles). Those need the memory now
for different purposes. The inodes and dentries used for updatedb will
gradually be aged by the dentry/inode reclaim algorithm, which will free
the dentries and inode entries at random positions throughout the slabs that
were allocated. As a result the slabs will become sparsely populated. If they
become empty then they can be freed, but a lot of them will remain sparsely
populated. That is where slab defrag comes in: it removes the slabs with
just a few entries, reclaiming more memory for other uses.
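
The effect described above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not kernel code: the slab
capacity of 16 objects, the function names and the uniform random choice of
surviving objects are all made up for illustration. It fills a number of
slabs, lets random aging free most objects, and compares how many slabs stay
pinned with how many would be needed if the survivors were densely packed.

```python
import random

OBJECTS_PER_SLAB = 16  # assumed slab capacity, chosen only for illustration


def simulate_aging(num_slabs, keep_percent, seed=0):
    """Fill num_slabs completely, then free objects at random positions.

    Returns a list with the number of live objects left in each slab.
    """
    rng = random.Random(seed)
    total = num_slabs * OBJECTS_PER_SLAB
    keep = total * keep_percent // 100
    # Object i lives in slab i // OBJECTS_PER_SLAB; pick the survivors at random.
    survivors = rng.sample(range(total), keep)
    live = [0] * num_slabs
    for obj in survivors:
        live[obj // OBJECTS_PER_SLAB] += 1
    return live


def summarize(live):
    """Count slabs that became empty versus slabs still pinned by objects."""
    empty = sum(1 for n in live if n == 0)
    in_use = len(live) - empty
    objects = sum(live)
    # Minimum number of slabs if the surviving objects were packed densely.
    needed = -(-objects // OBJECTS_PER_SLAB)
    return {"empty": empty, "in_use": in_use, "needed": needed}


if __name__ == "__main__":
    print(summarize(simulate_aging(num_slabs=1000, keep_percent=10)))
```

In this model the gap between "in_use" and "needed" is the memory that stays
tied up in sparsely populated slabs until something empties them.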

V5->V6
- Rediff against 2.6.24-rc2 + mm slub patches.
- Add Reviewed-by lines.
- Take out the experimental code to make slab pages movable. That
has to wait until this has been considered by Mel.

V4->V5:
- Support lumpy reclaim for slabs
- Support reclaim via slab_shrink()
- Add constructors to ensure a consistent object state at all times.

V3->V4:
- Optimize scan for slabs that need defragmentation
- Add /sys/slab/*/defrag_ratio to allow setting defrag limits
per slab.
- Add support for buffer heads.
- Describe how the cleanup after the daily updatedb can be
improved by slab defragmentation.

V2->V3
- Support directory reclaim
- Add infrastructure to trigger defragmentation after slab shrinking if we
have slabs with a high degree of fragmentation.

V1->V2
- Clean up control flow using a state variable. Simplify API. Back to 2
functions that now take arrays of objects.
- Inode defrag support for a set of filesystems
- Fix up dentry defrag support to work on negative dentries by adding
a new dentry flag that indicates that a dentry is not in the process
of being freed or allocated.
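
For the /sys/slab/*/defrag_ratio attribute mentioned in the V3->V4 notes, a
minimal userspace sketch for inspecting and setting the per-slab limits could
look like the following. It assumes that each attribute file holds a single
integer; that format, and the helper names read_defrag_ratios() and
set_defrag_ratio(), are assumptions for illustration and are not taken from
the patches. The files only exist on a kernel carrying this patchset;
elsewhere the glob simply matches nothing.

```python
import glob
import os


def read_defrag_ratios(sys_slab="/sys/slab"):
    """Return {cache_name: ratio} for every cache exposing defrag_ratio."""
    ratios = {}
    for path in glob.glob(os.path.join(sys_slab, "*", "defrag_ratio")):
        cache = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            # Assumption: the attribute contains one integer value.
            ratios[cache] = int(f.read().strip())
    return ratios


def set_defrag_ratio(cache, ratio, sys_slab="/sys/slab"):
    """Write a new limit for one cache (needs root on a real system)."""
    with open(os.path.join(sys_slab, cache, "defrag_ratio"), "w") as f:
        f.write(str(ratio))
```

The sys_slab parameter exists only so the helpers can be pointed at a scratch
directory for testing.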

--


2007-11-07 08:37:42

by Peter Zijlstra

Subject: Re: [patch 00/23] Slab defragmentation V6

On Tue, 2007-11-06 at 17:11 -0800, Christoph Lameter wrote:
> Slab defragmentation

I thought we had agreed that "targeted reclaim" was a more suitable term
for this work, as it does not do compaction.


2007-11-07 18:04:31

by Christoph Lameter

Subject: Re: [patch 00/23] Slab defragmentation V6

On Wed, 7 Nov 2007, Peter Zijlstra wrote:

> On Tue, 2007-11-06 at 17:11 -0800, Christoph Lameter wrote:
> > Slab defragmentation
>
> I thought we had agreed that: targeted reclaim, was a more suitable term
> for this work as it does not do compaction.

It does compaction by removing less populated slabs. If a kick method
reallocates the object instead of reclaiming it then this patchset fully
supports compaction. It's just that the kick methods are simplistic at this
point.
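
To make the distinction concrete, here is a toy model of the two kinds of kick
method. It is a sketch only: the Slab class and the names kick_reclaim() and
kick_move() are hypothetical and do not reflect the interface in the patches.
The first variant drops the objects (what the current, simplistic kick methods
amount to), the second reallocates them into other partially filled slabs,
which is the compacting behaviour.

```python
class Slab:
    """Toy slab: a fixed number of slots holding opaque objects."""

    def __init__(self, capacity, objects=()):
        self.capacity = capacity
        self.objects = list(objects)

    def free_slots(self):
        return self.capacity - len(self.objects)


def kick_reclaim(slab):
    """Simplistic kick: drop every object so the slab page can be freed."""
    dropped = slab.objects
    slab.objects = []
    return dropped


def kick_move(slab, others):
    """Compacting kick: reallocate each object into another partial slab.

    Objects that find no free slot elsewhere stay where they are.
    Returns True when the slab ended up empty.
    """
    remaining = []
    for obj in slab.objects:
        target = next((s for s in others if s.free_slots() > 0), None)
        if target is None:
            remaining.append(obj)
        else:
            target.objects.append(obj)
    slab.objects = remaining
    return not remaining
```

Either variant leaves the sparse slab empty so that its page can be freed;
only kick_move() keeps the objects alive.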

2007-11-08 15:27:16

by Mel Gorman

Subject: Re: [patch 00/23] Slab defragmentation V6

On Tue, 2007-11-06 at 17:11 -0800, Christoph Lameter wrote:
> Slab defragmentation is mainly an issue if Linux is used as a fileserver

Was hoping this would get renamed to SLUB Targeted Reclaim from
discussions at VM Summit. As no copying is taking place, calling it
defragmentation is confusing, to me anyway. Not a major deal but it made
reading the patches a little confusing.

> and large amounts of dentries, inodes and buffer heads accumulate. In some
> load situations the slabs become very sparsely populated so that a lot of
> memory is wasted by slabs that only contain one or a few objects. In
> extreme cases the performance of a machine will become sluggish since
> we are continually running reclaim. Slab defragmentation adds the
> capability to recover wasted memory.
>

When reading this first, I expected to find how slab objects get copied
around and packed which is my problem with the defragmentation name.
Again, not really that relevant to the code.

> With lumpy reclaim slab defragmentation can be used to enhance the
> ability to recover larger contiguous areas of memory. Lumpy reclaim currently
> cannot do anything if a slab page is encountered. With slab defragmentation
> that slab page can be removed and a large contiguous page freed. It may
> be possible to have slab pages also part of ZONE_MOVABLE (Mel's defrag
> scheme in 2.6.23)

More terminology nit-pick - ZONE_MOVABLE is not defragmenting anything.
It's just partitioning memory. The slab pages need to be 100%
reclaimable or movable for that to happen but even with targeted
reclaim, some dentries such as the root directory one cannot be
reclaimed, right?

>
> or the MOVABLE areas (antifrag patches in mm).
>

It'd still be valid to leave them as MIGRATE_RECLAIMABLE because that is
what they are. Arguably, MIGRATE_RECLAIMABLE could be dropped in its
entirety but I'd rather not as reclaimable blocks have significantly
different reclaim costs to pages that are currently marked movable.

> The patchset is also available via git
>
> git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/slab.git defrag
>
>
> Currently memory reclaim from the following slab caches is possible:
>
> 1. dentry cache
> 2. inode cache (with a generic interface to allow easy setup of more
> filesystems than the currently supported ext2/3/4 reiserfs, XFS
> and proc)
> 3. buffer_heads
>
> One typical mechanism that triggers slab defragmentation on my systems
> is the daily run of
>
> updatedb
>
> Updatedb scans all files on the system which causes a high inode and dentry
> use. After updatedb is complete we need to go back to the regular use
> patterns (typical on my machine: kernel compiles). Those need the memory now
> for different purposes. The inodes and dentries used for updatedb will
> gradually be aged by the dentry/inode reclaim algorithm which will free
> up the dentries and inode entries randomly through the slabs that were
> allocated. As a result the slabs will become sparsely populated. If they
> become empty then they can be freed but a lot of them will remain sparsely
> populated. That is where slab defrag comes in: It removes the slabs with
> just a few entries reclaiming more memory for other uses.
>
> V5->V6
> - Rediff against 2.6.24-rc2 + mm slub patches.
> - Add reviewed by lines.
> - Take out the experimental code to make slab pages movable. That
> has to wait until this has been considered by Mel.
>

I still haven't considered them properly. I've been backlogged for I
don't know how long at this point and this is on the increasingly large
todo list :( . I don't believe it is massively urgent at the moment
though and reclaiming to start with is perfectly adequate just as lumpy
reclaim is fine at the moment.

> V4->V5:
> - Support lumpy reclaim for slabs
> - Support reclaim via slab_shrink()
> - Add constructors to insure a consistent object state at all times.
>
> V3->V4:
> - Optimize scan for slabs that need defragmentation
> - Add /sys/slab/*/defrag_ratio to allow setting defrag limits
> per slab.
> - Add support for buffer heads.
> - Describe how the cleanup after the daily updatedb can be
> improved by slab defragmentation.
>
> V2->V3
> - Support directory reclaim
> - Add infrastructure to trigger defragmentation after slab shrinking if we
> have slabs with a high degree of fragmentation.
>
> V1->V2
> - Clean up control flow using a state variable. Simplify API. Back to 2
> functions that now take arrays of objects.
> - Inode defrag support for a set of filesystems
> - Fix up dentry defrag support to work on negative dentries by adding
> a new dentry flag that indicates that a dentry is not in the process
> of being freed or allocated.
>

2007-11-08 16:01:26

by Lee Schermerhorn

Subject: Plans for Onezonelist patch series ???

Mel [anyone?]

Do you know what the plans are for your "onezonelist" patch series?

Are they going into -mm for, maybe, .25? Or have they been dropped?

I carry the last posting in my mempolicy tree--sometimes below my
patches; sometimes above. Our patches touch some of the same places in
mempolicy.c and require reject resolution when changing the order. I
could save Andrew some work if I knew that your patches were going to be
in the next -mm by holding off and doing the rebase myself.

Regards,
Lee

2007-11-08 18:34:41

by Christoph Lameter

Subject: Re: Plans for Onezonelist patch series ???

On Thu, 8 Nov 2007, Lee Schermerhorn wrote:

> Do you know what the plans are for your "onezonelist" patch series?

I wonder too what's going on? I thought they were ready for merging but I
did not see a repost after the last round of comments.

2007-11-08 18:39:40

by mel

Subject: Re: Plans for Onezonelist patch series ???

On (08/11/07 11:01), Lee Schermerhorn didst pronounce:
> Mel [anyone?]
>
> Do you know what the plans are for your "onezonelist" patch series?
>

I was holding off trying to add new features to current mainline or -mm as
there were a number of stability issues and one-zonelist touches a number
of areas. Minimally, I was waiting for another -mm to come out and rebase
to that. I'll rebase to latest git tomorrow, see how that looks and post
it if it passes regression tests on Monday.

> Are they going into -mm for, maybe, .25? Or have they been dropped.
>
> I carry the last posting in my mempolicy tree--sometimes below my
> patches; sometimes above. Our patches touch some of the same places in
> mempolicy.c and require reject resolution when changing the order. I
> can save Andrew some work if I knew that your patches were going to be
> in the next -mm by holding off and doing the rebase myself.
>

The one-zonelist stuff is likely to be more controversial than what you
are doing. It may be best if the one-zonelist patches are based on top
of yours rather than the other way around.

Thanks

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2007-11-08 18:43:20

by mel

Subject: Re: Plans for Onezonelist patch series ???

On (08/11/07 10:34), Christoph Lameter didst pronounce:
> On Thu, 8 Nov 2007, Lee Schermerhorn wrote:
>
> > Do you know what the plans are for your "onezonelist" patch series?
>
> I wonder too whats going on? I thought they were ready for merging but I
> did not see a repost after the last round of comments.
>

There were two bugs that were resolved but I didn't repost after that as
mainline + -mm had gone to hell in a hand-basket and I didn't want to
add to the mess.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2007-11-08 18:44:09

by Christoph Lameter

Subject: Re: Plans for Onezonelist patch series ???

On Thu, 8 Nov 2007, Mel Gorman wrote:

> There was two bugs that were resolved but I didn't repost after that as
> mainline + -mm had gone to hell in a hand-basket and I didn't want to
> add to the mess.

Hell? I must have missed it.

2007-11-08 19:12:58

by Christoph Lameter

Subject: Re: [patch 00/23] Slab defragmentation V6

On Thu, 8 Nov 2007, Mel Gorman wrote:

> On Tue, 2007-11-06 at 17:11 -0800, Christoph Lameter wrote:
> > Slab defragmentation is mainly an issue if Linux is used as a fileserver
>
> Was hoping this would get renamed to SLUB Targetted Reclaim from
> discussions at VM Summit. As no copying is taking place, it's confusing
> to call it defragmentation to me anyway. Not a major deal but it made
> reading the patches a little confusing.

The problem is that people are focusing on one feature here and forget
about the rest. Targeted reclaim is one feature that was added later when
lumpy reclaim was added to the kernel. The primary intent of this patchset
was always to reduce the fragmentation. The name is appropriate and the
patchset will support copying of objects as soon as support for that is
added to the kick() methods. In that case the copying you are looking for
will be there. The simple implementation for the kick() methods is to simply
copy pieces of the reclaim code. That is what is included here.

> > With lumpy reclaim slab defragmentation can be used to enhance the
> > ability to recover larger contiguous areas of memory. Lumpy reclaim currently
> > cannot do anything if a slab page is encountered. With slab defragmentation
> > that slab page can be removed and a large contiguous page freed. It may
> > be possible to have slab pages also part of ZONE_MOVABLE (Mel's defrag
> > scheme in 2.6.23)
>
> More terminology nit-pick - ZONE_MOVABLE is not defragmenting anything.
> It's just partitioning memory. The slab pages need to be 100%
> reclaimable or movable for that to happen but even with targetted
> reclaim, some dentries such as the root directory one cannot be
> reclaimed, right?

100%? I am so fond of these categorical statements ....

ZONE_MOVABLE also contains mlocked pages that are also not reclaimable.
The question is at what level would it be possible to make them MOVABLE?
It may take some improvements to the kick() methods to make eviction more
reliable. Allowing the moving of objects in the kick() methods will
likely get us there.

> It'd still be valid to leave them as MIGRATE_RECLAIMABLE because that is
> what they are. Arguably, MIGRATE_RECLAIMABLE could be dropped in it's
> entirety but I'd rather not as reclaimable blocks have significantly
> different reclaim costs to pages that are currently marked movable.

Right. That would simplify the antifrag methods. Is there any way to
measure the reclaim costs?

> > V5->V6
> > - Rediff against 2.6.24-rc2 + mm slub patches.
> > - Add reviewed by lines.
> > - Take out the experimental code to make slab pages movable. That
> > has to wait until this has been considered by Mel.
> >
>
> I still haven't considered them properly. I've been backlogged for I
> don't know how long at this point and this is on the increasingly large
> todo list :( . I don't believe it is massively urgent at the moment
> though and reclaiming to start with is perfectly adequate just as lumpy
> reclaim is fine at the moment.

Right. We can defer this for now.

2007-11-08 19:39:39

by Christoph Lameter

Subject: Re: Plans for Onezonelist patch series ???

On Thu, 8 Nov 2007, Mel Gorman wrote:

> I was holding off trying to add new features to current mainline or -mm as
> there were a number of stability issues and one-zonelist touches a number
> of areas. Minimally, I was waiting for another -mm to come out and rebase
> to that. I'll rebase to latest git tomorrow, see how that looks and post
> it if passes regression tests on Monday.

Ahh. Great. I am also impatiently waiting for that patchset. I was tempted
several times this week to just pick up where you left off...

2007-11-08 20:06:22

by mel

Subject: Re: Plans for Onezonelist patch series ???

On (08/11/07 10:43), Christoph Lameter didst pronounce:
> On Thu, 8 Nov 2007, Mel Gorman wrote:
>
> > There was two bugs that were resolved but I didn't repost after that as
> > mainline + -mm had gone to hell in a hand-basket and I didn't want to
> > add to the mess.
>
> Hell? I must have missed it.
>

Some time after rc1, things appeared in a mess - at least I didn't have
much luck figuring out what was going on when I looked. Admittedly, being
very ill at the time I didn't spend much effort on it. Either way things
were churning enough that there seemed to be enough going on without adding
one-zonelist to the mix.

I've rebased the patches to mm-broken-out-2007-11-06-02-32. However, the
vanilla -mm and the one with onezonelist applied are locking up in the
same manner. I'm way too behind at the moment to guess if it is a new bug
or reported already. At best, I can say the patches are not making things
any worse :) I'll go through the archives in the morning and do a bit more
testing to see what happens.

In case this is familiar to people, the lockup I see is:

[ 115.548908] BUG: spinlock bad magic on CPU#0, sshd/2752
[ 115.611371] lock: c20029c8, .magic: ffffffff, .owner: <none>/-1, .owner_cpu: -1066669496
[ 115.709027] [<c010526a>] show_trace_log_lvl+0x1a/0x30
[ 115.770560] [<c0105c02>] show_trace+0x12/0x20
[ 115.823787] [<c0105d16>] dump_stack+0x16/0x20
[ 115.877011] [<c022c226>] spin_bug+0x96/0xf0
[ 115.928172] [<c022c429>] _raw_spin_lock+0x69/0x140
[ 115.986580] [<c033f05f>] _spin_lock+0x4f/0x60
[ 116.039809] [<c0224bae>] kobject_add+0x4e/0x1a0
[ 116.095112] [<c01314b4>] uids_user_create+0x54/0x80
[ 116.154555] [<c01318e2>] alloc_uid+0xd2/0x150
[ 116.207784] [<c01356db>] set_user+0x2b/0xb0
[ 116.258951] [<c01373c1>] sys_setreuid+0x141/0x150
[ 116.316305] [<c010429e>] syscall_call+0x7/0xb
[ 116.369544] =======================
[ 127.680346] BUG: soft lockup - CPU#0 stuck for 11s! [sshd:2752]
[ 127.750987]
[ 127.768781] Pid: 2752, comm: sshd Not tainted (2.6.24-rc1-mm1 #1)
[ 127.841498] EIP: 0060:[<c02298c1>] EFLAGS: 00000246 CPU: 0
[ 127.906948] EIP is at delay_tsc+0x1/0x20
[ 127.953754] EAX: 00000001 EBX: c20029c8 ECX: b0953e83 EDX: 2b35cb6d
[ 128.028533] ESI: 04b81d83 EDI: 00000000 EBP: c30f1ec4 ESP: c30f1ebc
[ 128.103305] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 128.167717] CR0: 80050033 CR2: b7e45544 CR3: 02153000 CR4: 00000690
[ 128.242490] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 128.317253] DR6: ffff0ff0 DR7: 00000400
[ 128.363010] [<c010526a>] show_trace_log_lvl+0x1a/0x30
[ 128.424541] [<c0105c02>] show_trace+0x12/0x20
[ 128.477767] [<c010250c>] show_regs+0x1c/0x20
[ 128.529948] [<c015ab6b>] softlockup_tick+0x11b/0x150
[ 128.590446] [<c0130fb2>] run_local_timers+0x12/0x20
[ 128.649888] [<c0131182>] update_process_times+0x42/0x90
[ 128.713461] [<c01440f5>] tick_periodic+0x25/0x80
[ 128.769812] [<c0144169>] tick_handle_periodic+0x19/0x80
[ 128.833397] [<c0107519>] timer_interrupt+0x49/0x50
[ 128.891802] [<c015aeb8>] handle_IRQ_event+0x28/0x60
[ 128.951246] [<c015c7f8>] handle_level_irq+0x78/0xe0
[ 129.010693] [<c01065f0>] do_IRQ+0x40/0x80
[ 129.059782] [<c0104c5f>] common_interrupt+0x23/0x28
[ 129.119229] [<c022c472>] _raw_spin_lock+0xb2/0x140
[ 129.177635] [<c033f05f>] _spin_lock+0x4f/0x60
[ 129.230851] [<c0224bae>] kobject_add+0x4e/0x1a0
[ 129.286175] [<c01314b4>] uids_user_create+0x54/0x80
[ 129.345598] [<c01318e2>] alloc_uid+0xd2/0x150
[ 129.398830] [<c01356db>] set_user+0x2b/0xb0
[ 129.450005] [<c01373c1>] sys_setreuid+0x141/0x150
[ 129.507380] [<c010429e>] syscall_call+0x7/0xb
[ 129.560605] =======================


--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2007-11-08 20:20:22

by Christoph Lameter

Subject: Re: Plans for Onezonelist patch series ???

On Thu, 8 Nov 2007, Mel Gorman wrote:

> I've rebased the patches to mm-broken-out-2007-11-06-02-32. However, the
> vanilla -mm and the one with onezonelist applied are locking up in the
> same manner. I'm way too behind at the moment to guess if it is a new bug
> or reported already. At best, I can say the patches are not making things
> any worse :) I'll go through the archives in the morning and do a bit more
> testing to see what happens.

I usually base my patches on Linus' tree as long as there is no tree
available from Andrew. But that means that I may have to
approximate what is in there by adding this and that.

2007-11-08 20:24:44

by mel

Subject: Re: [patch 00/23] Slab defragmentation V6

On (08/11/07 11:12), Christoph Lameter didst pronounce:
> On Thu, 8 Nov 2007, Mel Gorman wrote:
>
> > On Tue, 2007-11-06 at 17:11 -0800, Christoph Lameter wrote:
> > > Slab defragmentation is mainly an issue if Linux is used as a fileserver
> >
> > Was hoping this would get renamed to SLUB Targetted Reclaim from
> > discussions at VM Summit. As no copying is taking place, it's confusing
> > to call it defragmentation to me anyway. Not a major deal but it made
> > reading the patches a little confusing.
>
> The problem is that people are focusing on one feature here and forget
> about the rest. Targetted reclaim is one feature that was added later when
> lumpy reclaim was added to the kernel. The primary intend of this patchset
> was always to reduce the fragmentation. The name is appropriate and the
> patchset will support copying of objects as soon as support for that is
> added to the kick(). In that case the copying you are looking for will be
> there. The simple implementation for the kick() methods is to simply copy
> pieces of the reclaim code. That is what is included here.
>

Ok, fair enough logic and it's a bit clearer in my head how to separate them
out. Thanks

> > > With lumpy reclaim slab defragmentation can be used to enhance the
> > > ability to recover larger contiguous areas of memory. Lumpy reclaim currently
> > > cannot do anything if a slab page is encountered. With slab defragmentation
> > > that slab page can be removed and a large contiguous page freed. It may
> > > be possible to have slab pages also part of ZONE_MOVABLE (Mel's defrag
> > > scheme in 2.6.23)
> >
> > More terminology nit-pick - ZONE_MOVABLE is not defragmenting anything.
> > It's just partitioning memory. The slab pages need to be 100%
> > reclaimable or movable for that to happen but even with targetted
> > reclaim, some dentries such as the root directory one cannot be
> > reclaimed, right?
>
> 100%? I am so fond of these categorical statements ....
>

Yeah, they are great for all occasions.

In fairness when the time comes, I can do a few tests using the hugepage
allocation tests with ZONE_MOVABLE and Badari might do a few tests with
memory hot-remove. Currently, the success rates for these tests are 100%
within ZONE_MOVABLE although that is without locked pages. Hot-remove
should be able to deal with locked pages but hugepage allocation wouldn't
as lumpy-reclaim would fail. If we allow slab pages to use the zone and the
success rates drop, it'll be obvious which is a plus at least.

> ZONE_MOVABLE also contains mlocked pages that are also not reclaimable.

True, but they are movable so for example memory hot-remove is able to
deal with them and the memory compaction patches should have been able
to deal with it too.

> The question is at what level would it be possible to make them MOVABLE?
> It may take some improvements to the kick() methods to make eviction more
> reliable. Allowing the moving of objects in the kick() methods will
> likely get us there.
>

It certainly can be tried out. However, this is a future problem and
independent of the current patchset. I don't want to drag us down a blind
alley about a problem that isn't even at hand.

Right now, I think the set looks in good shape for wider testing and appears
to solve a major part of the slab fragmentation problem. Assuming I don't
fall down a hole testing one-zonelist and the mm-broken-out patches, I'll
get to testing these patches as well.

> > It'd still be valid to leave them as MIGRATE_RECLAIMABLE because that is
> > what they are. Arguably, MIGRATE_RECLAIMABLE could be dropped in it's
> > entirety but I'd rather not as reclaimable blocks have significantly
> > different reclaim costs to pages that are currently marked movable.
>
> Right. That would simplify the antifrag methods. Is there any way to
> measure the reclaim costs?
>

Regrettably, no.

> > > V5->V6
> > > - Rediff against 2.6.24-rc2 + mm slub patches.
> > > - Add reviewed by lines.
> > > - Take out the experimental code to make slab pages movable. That
> > > has to wait until this has been considered by Mel.
> > >
> >
> > I still haven't considered them properly. I've been backlogged for I
> > don't know how long at this point and this is on the increasingly large
> > todo list :( . I don't believe it is massively urgent at the moment
> > though and reclaiming to start with is perfectly adequate just as lumpy
> > reclaim is fine at the moment.
>
> Right. We can defer this for now.
>

Agreed.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2007-11-08 20:28:25

by Christoph Lameter

Subject: Re: [patch 00/23] Slab defragmentation V6

On Thu, 8 Nov 2007, Mel Gorman wrote:

> It certainly can be tried out. However, this is a future problem and
> independent of the current patchset. I don't want to drag us down a blind
> alley about a problem that isn't even at hand.

Right. That is why I took it out.

2007-11-08 20:29:47

by mel

Subject: Re: Plans for Onezonelist patch series ???

On (08/11/07 12:20), Christoph Lameter didst pronounce:
> On Thu, 8 Nov 2007, Mel Gorman wrote:
>
> > I've rebased the patches to mm-broken-out-2007-11-06-02-32. However, the
> > vanilla -mm and the one with onezonelist applied are locking up in the
> > same manner. I'm way too behind at the moment to guess if it is a new bug
> > or reported already. At best, I can say the patches are not making things
> > any worse :) I'll go through the archives in the morning and do a bit more
> > testing to see what happens.
>
> I usually base my patches on Linus' tree as long as there is no tree
> available from Andrew. But that means that may have to
> approximate what is in there by adding this and that.
>

Unfortunately for me, there are several collisions with the patches when
applied against -mm if the patches are based on latest git. They are mainly in
mm/vmscan.c due to the memory controller work. For the purposes of testing and
merging, it makes more sense for me to work against -mm as much as possible.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2007-11-08 21:20:40

by Lee Schermerhorn

Subject: Re: [patch 00/23] Slab defragmentation V6

On Thu, 2007-11-08 at 11:12 -0800, Christoph Lameter wrote:
> On Thu, 8 Nov 2007, Mel Gorman wrote:
>
> > On Tue, 2007-11-06 at 17:11 -0800, Christoph Lameter wrote:
> > > Slab defragmentation is mainly an issue if Linux is used as a fileserver
> >
> > Was hoping this would get renamed to SLUB Targetted Reclaim from
> > discussions at VM Summit. As no copying is taking place, it's confusing
> > to call it defragmentation to me anyway. Not a major deal but it made
> > reading the patches a little confusing.
>
> The problem is that people are focusing on one feature here and forget
> about the rest. Targetted reclaim is one feature that was added later when
> lumpy reclaim was added to the kernel. The primary intend of this patchset
> was always to reduce the fragmentation. The name is appropriate and the
> patchset will support copying of objects as soon as support for that is
> added to the kick(). In that case the copying you are looking for will be
> there. The simple implementation for the kick() methods is to simply copy
> pieces of the reclaim code. That is what is included here.
>
> > > With lumpy reclaim slab defragmentation can be used to enhance the
> > > ability to recover larger contiguous areas of memory. Lumpy reclaim currently
> > > cannot do anything if a slab page is encountered. With slab defragmentation
> > > that slab page can be removed and a large contiguous page freed. It may
> > > be possible to have slab pages also part of ZONE_MOVABLE (Mel's defrag
> > > scheme in 2.6.23)
> >
> > More terminology nit-pick - ZONE_MOVABLE is not defragmenting anything.
> > It's just partitioning memory. The slab pages need to be 100%
> > reclaimable or movable for that to happen but even with targetted
> > reclaim, some dentries such as the root directory one cannot be
> > reclaimed, right?
>
> 100%? I am so fond of these categorical statements ....
>
> ZONE_MOVABLE also contains mlocked pages that are also not reclaimable.
> The question is at what level would it be possible to make them MOVABLE?
> It may take some improvements to the kick() methods to make eviction more
> reliable. Allowing the moving of objects in the kick() methods will
> > likely get us there.

Christoph: Although mlocked pages are not reclaimable, they ARE
migratable. You fixed that a long time ago. [And I just verified with
memtoy.] Doesn't this make them "movable"?

Lee

2007-11-08 21:28:17

by Christoph Lameter

Subject: Re: [patch 00/23] Slab defragmentation V6

On Thu, 8 Nov 2007, Lee Schermerhorn wrote:

> > ZONE_MOVABLE also contains mlocked pages that are also not reclaimable.
> > The question is at what level would it be possible to make them MOVABLE?
> > It may take some improvements to the kick() methods to make eviction more
> > reliable. Allowing the moving of objects in the kick() methods will
> > likely get us there.
>
> Christoph: Although mlocked pages are not reclaimable, they ARE
> migratable. You fixed that a long time ago. [And I just verified with
> memtoy.] Doesn't this make them "movable"?

I know. They are movable but not reclaimable.