2002-07-21 11:24:10

by Craig Kulesa

Subject: [PATCH 2/2] move slab pages to the lru, for 2.5.27



This is an update for the 2.5 port of Ed Tomlinson's patch to move slab
pages onto the lru for page aging, atop 2.5.27 and the full rmap patch.
It is aimed at being a fairer, self-tuning way to target and evict slab
pages.

Previous description:
http://mail.nl.linux.org/linux-mm/2002-07/msg00216.html
Patch URL:
http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/2.5.27/

What's next:

This patch is intermediate between where we were (freeing slab caches
blindly and not in tune with the rest of the VM), and where we want to be
(cache pruning by page as we scan the active list looking for cold pages
to deactivate). Uhhh, well, I *think* that's where we want to be. :)

How do we get there?

Given a slab page, I can find out what cachep and slab I'm dealing with
(via GET_PAGE_SLAB and friends). If the cache is a prunable one,
cachep->pruner tells me what kind of callback (dcache/inode/dquot) I
should invoke to prune the page. No problem.
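For concreteness, here's roughly what I mean, as a sketch only (this
would live in mm/slab.c, where GET_PAGE_SLAB and friends are defined;
the ->pruner field is from the slablru patch, and the helper name and
callback signature here are just my illustration, not actual patch code):

	/*
	 * Sketch: given a slab page seen on the LRU, look up its cache
	 * and slab and ask the owning cache to free that slab's
	 * objects.  GET_PAGE_CACHE/GET_PAGE_SLAB are the stock
	 * mm/slab.c macros; the ->pruner argument list is a guess.
	 */
	static int try_to_prune_slab_page(struct page *page)
	{
		kmem_cache_t *cachep = GET_PAGE_CACHE(page);
		slab_t *slabp = GET_PAGE_SLAB(page);

		if (!cachep->pruner)
			return 0;	/* no dcache/icache/dquot hook */

		/* try to free the objects backing this slab */
		return cachep->pruner(cachep, slabp);
	}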

The trouble comes when we try to replace shrink_dcache_memory() and
friends with slab-aware pruners. Namely, how to teach those
inode/dcache/dquot callbacks to free objects belonging to a *specified*
page or slab? If I have a dentry slab, I'd like to try to liberate
*those* dentries, not some random ones like shrink_dcache_memory does now.
I'm still trying to figure out how to make that work. Or is that
totally the wrong approach? Thoughts? ;)

[I understand Rik's working on this, but my curiosity made me ask anyway!]

Comments, fixes, & feedback always welcome. :)

Craig Kulesa
Steward Obs.
Univ. of Arizona


2002-07-21 13:30:02

by Ed Tomlinson

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On July 21, 2002 07:24 am, Craig Kulesa wrote:

> This patch is intermediate between where we were (freeing slab caches
> blindly and not in tune with the rest of the VM), and where we want to be
> (cache pruning by page as we scan the active list looking for cold pages
> to deactivate). Uhhh, well, I *think* that's where we want to be. :)
>
> How do we get there?
>
> Given a slab page, I can find out what cachep and slab I'm dealing with
> (via GET_PAGE_SLAB and friends). If the cache is a prunable one,
> cachep->pruner tells me what kind of callback (dcache/inode/dquot) I
> should invoke to prune the page. No problem.
>
> The trouble comes when we try to replace shrink_dcache_memory() and
> friends with slab-aware pruners. Namely, how to teach those
> inode/dcache/dquot callbacks to free objects belonging to a *specified*
> page or slab? If I have a dentry slab, I'd like to try to liberate
> *those* dentries, not some random ones like shrink_dcache_memory does now.

Well not quite random. It prunes the oldest entries. The idea behind the prunable
callback is that some caches have specific aging methods. What I tried to do here
was keep the rate of aging in sync with the VM.

> I'm still trying to figure out how to make that work. Or is that
> totally the wrong approach? Thoughts? ;)

That's a question I have asked myself too. What could be done is: scan the
entries in the slab encountered using a callback, and free them if they are purgeable.
If this ends up producing an empty slab, release it.

>Intermezzo has a funky dentry cache that may need a pruner method (??),
>but I didn't touch it. If there was a better way to do this, I was too
>blind to see it.

Looking at the Intermezzo dcache code, I think you made the right choice.
I do not think this needs a pruner method.

Ed Tomlinson



2002-07-21 21:23:23

by Craig Kulesa

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27


On Sun, 21 Jul 2002, Ed Tomlinson wrote:

> Well not quite random. It prunes the oldest entries.

Yes, you're right, I was speaking sloppily.

I meant "random" in the sense that "those pruned entries don't correspond
to the page I'm scanning right now in refill_inactive_zone". Establishing
that connection seems simultaneously very interesting and confusing. :)

> What I tried to do here was keep the rate of aging in sync with the VM.

I think it works darn well too! But maybe a page-centric freeing system
would do that too. In your patch, you distinguish between:

- aging prunable slab pages by "counting" the slab pages
to determine the rate of pruning, and using the caches' internal
lru order to determine the actual entries that get pruned
and
- aging non-prunable slabs with page->age.

Can we unify them and just use page->age (or, in the case of stock 2.5.27,
do it in the VM's LRU order)? That is, if you encounter a cold slab page
in scanning the active list, try to prune just *that* page's slab
resources. If it manages to free the slab, free the page. Otherwise,
try again next time we visit the page.
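In rough code (illustrative only; try_to_prune_slab_page() would be a
page-to-pruner helper along the lines sketched in my first mail):

	/* hypothetical fragment in the refill_inactive_zone() scan;
	 * page->age is from the full rmap patch */
	if (PageSlab(page) && page->age == 0) {
		/* cold slab page: try to free its objects right now */
		if (try_to_prune_slab_page(page))
			continue;	/* slab emptied, page gets freed */
		/* otherwise leave it and retry on a later pass */
	}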

> That's a question I have asked myself too. What could be done is: scan
> the entries in the slab encountered using a callback,

I think *that's* the part I'm having difficulty envisioning. If
cachep->pruner, then I might find myself in dcache.c (for example)
living in a pruner callback that no longer remembers "my" slab page.
Seems like we need a "dentry_lookup" method that returns a list of
[di]cache objects living in a specified slab (page). Then feed that
list to a modified prune_[di]cache and see if that frees the slab.
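Or, instead of returning a list, the walk could prune matching entries in
place. Something like this hypothetical fs/dcache.c helper, and only as a
sketch: the locking is hand-waved, since prune_one_dentry() drops and
re-takes dcache_lock, which makes the simple list walk below unsafe as
written.

	/*
	 * Free only the unused dentries that live in the given slab
	 * page.  Sketch only: because prune_one_dentry() drops and
	 * re-acquires dcache_lock, 'next' can go stale mid-walk.
	 */
	static int prune_dentries_in_page(struct page *page)
	{
		struct list_head *tmp, *next;
		int freed = 0;

		spin_lock(&dcache_lock);
		list_for_each_safe(tmp, next, &dentry_unused) {
			struct dentry *dentry =
				list_entry(tmp, struct dentry, d_lru);

			if (virt_to_page(dentry) != page)
				continue;	/* some other slab */
			list_del_init(&dentry->d_lru);
			dentry_stat.nr_unused--;
			prune_one_dentry(dentry);
			freed++;
		}
		spin_unlock(&dcache_lock);
		return freed;
	}

If that frees every object in the slab, the slab (and its page) can go.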

Not the current "prune 'N' old entries from the cache, and I don't care
where they live in memory". We're coming in instead and saying "I _know_
this page is old, so try to free its entries". This is, I suppose, saying
that we want to throw out (or at least ignore) the caches' internal LRU
aging methods and use the VM's LRU order (2.5.27), or page->age (2.5.27
plus full rmap). Uh oh. This is getting scary. And maybe wrong. :)

So, that list-returning method has me befuddled. And maybe we don't
really want to do any of this. Which is why I asked. :)

> Looking at the Intermezzo dcache code, I think you made the right choice.
> I do not think this needs a pruner method.

Whew! :)

Comments?

Craig Kulesa
Steward Obs.
Univ. of Arizona

2002-07-21 21:28:28

by William Lee Irwin III

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On Sun, Jul 21, 2002 at 04:24:55AM -0700, Craig Kulesa wrote:
> This is an update for the 2.5 port of Ed Tomlinson's patch to move slab
> pages onto the lru for page aging, atop 2.5.27 and the full rmap patch.
> It is aimed at being a fairer, self-tuning way to target and evict slab
> pages.

In combination with the pte_chain in slab patch, this should at long last
enable reclamation of unused pte_chains after surges in demand. Can you
test this to verify that reclamation is actually done? (I'm embroiled in
a long debugging session at the moment.)


Thanks,
Bill

2002-07-21 23:12:14

by Ed Tomlinson

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On July 21, 2002 05:24 pm, Craig Kulesa wrote:
> On Sun, 21 Jul 2002, Ed Tomlinson wrote:
> > Well not quite random. It prunes the oldest entries.
>
> Yes, you're right, I was speaking sloppily.
>
> I meant "random" in the sense that "those pruned entries don't correspond
> to the page I'm scanning right now in refill_inactive_zone". Establishing
> that connection seems simultaneously very interesting and confusing. :)
>
> > What I tried to do here was keep the rate of aging in sync with the VM.
>
> I think it works darn well too! But maybe a page-centric freeing system
> would do that too. In your patch, you distinguish between:
>
> - aging prunable slab pages by "counting" the slab pages
> to determine the rate of pruning, and using the caches' internal
> lru order to determine the actual entries that get pruned
> and
> - aging non-prunable slabs with page->age.
>
> Can we unify them and just use page->age (or, in the case of stock 2.5.27,
> do it in the VM's LRU order)? That is, if you encounter a cold slab page
> in scanning the active list, try to prune just *that* page's slab
> resources. If it manages to free the slab, free the page. Otherwise,
> try again next time we visit the page.
>
> > That's a question I have asked myself too. What could be done is: scan
> > the entries in the slab encountered using a callback,
>
> I think *that's* the part I'm having difficulty envisioning. If
> cachep->pruner, then I might find myself in dcache.c (for example)
> living in a pruner callback that no longer remembers "my" slab page.
> Seems like we need a "dentry_lookup" method that returns a list of
> [di]cache objects living in a specified slab (page). Then feed that
> list to a modified prune_[di]cache and see if that frees the slab.
>
> Not the current "prune 'N' old entries from the cache, and I don't care

Make that: if the slab is empty, free it; if the cache has a pruner
callback, count the entries in it for aging later.

> where they live in memory". We're coming in instead and saying "I _know_
> this page is old, so try to free its entries". This is, I suppose, saying
> that we want to throw out (or at least ignore) the caches' internal LRU
> aging methods and use the VM's LRU order (2.5.27), or page->age (2.5.27

Exactly.

> plus full rmap). Uh oh. This is getting scary. And maybe wrong. :)
>
> So, that list-returning method has me befuddled. And maybe we don't
> really want to do any of this. Which is why I asked. :)

It actually works fairly well as currently implemented. I suspect it would
work better if we aged entries in the slabs we encountered. That being
said, I think it would take a fairly major change to the dcache/icache
logic to allow this to happen.

I think we would need to know two things: first, is the slab entry active,
and second, what list(s) is the dentry or inode on. With this info we loop
through all entries in a slab; if an entry is active we call the pruner
callback to prune it if possible. If we pruned all entries, free the slab.

1. How do we know if a slab entry is active?
2. How do we determine what list(s) the dentry, inode is on?
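For (1), the slab's own bookkeeping might already be enough. A sketch,
using stock slab internals (slabp->free, slab_bufctl() and BUFCTL_END are
private to mm/slab.c, so this would have to live there):

	/*
	 * Everything not on the slab's free list is an active object:
	 * walk the bufctl free chain and clear those slots; whatever
	 * stays marked is active.  'active' must hold cachep->num
	 * bytes; error handling omitted.
	 */
	static void find_active_objects(kmem_cache_t *cachep, slab_t *slabp,
					unsigned char *active)
	{
		unsigned int i;
		kmem_bufctl_t n;

		for (i = 0; i < cachep->num; i++)
			active[i] = 1;		/* assume in use */
		for (n = slabp->free; n != BUFCTL_END;
		     n = slab_bufctl(slabp)[n])
			active[n] = 0;		/* actually free */
	}

Question (2) is the harder one; the slab layer knows nothing about the
dcache/icache lists its objects are on.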

Ed Tomlinson




2002-07-22 07:07:22

by Craig Kulesa

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27


On Sun, 21 Jul 2002, William Lee Irwin III wrote:

> Can you test this to verify that reclamation is actually done?

Looks like it is. I have a script that does slocate, loads monstrous
blank images into gimp, manipulates them to drive the machine into swap,
performs a 'find' to test and make sure we haven't evicted slab pages too
readily, loads some additional applications, and performs a few dbench
runs. The apps are killed in LIFO order to see if the page aging works.
Afterwards, the find is performed again to see if any of the slabs are
still cached.

vmstat is running through the entire test, and /proc/[mem,slab]info are
saved very occasionally.

I ran the test on 2.5.27-rmap and 2.5.27-rmap-slablru to test
pte_chain reclaim and to see if the slablru patch causes measurable
slowdowns due to more LRU list traffic. It apparently doesn't:

Time to completion:
2.5.27: 203 sec
2.5.27-rmap: 205 sec
2.5.27-rmap-slablru: 205 sec

Swapouts during test:
2.5.27: 30092 kB
2.5.27-rmap: 43520 kB
2.5.27-rmap-slablru: 40948 kB

Swapins during test:
2.5.27: 13364 kB
2.5.27-rmap: 8616 kB
2.5.27-rmap-slablru: 8452 kB

Slab reclaim looks sane for the slablru kernel throughout the test. In
particular, here are the pte_chain pool entries through the test from
/proc/slabinfo (columns: active objects, total objects, object size,
active slabs, total slabs, pages per slab):

pte_chain 20061 21294 8 60 63 1
pte_chain 20061 21294 8 60 63 1
pte_chain 20822 24336 8 65 72 1
pte_chain 21563 24336 8 65 72 1
pte_chain 19921 23660 8 70 70 1
pte_chain 18501 23660 8 63 70 1
pte_chain 18483 22984 8 63 68 1


Not very dramatic in this example since this was a pretty mild load, but
it does seem to work.

Craig Kulesa
Steward Obs.
Univ. of Arizona

2002-07-22 18:54:43

by Steven Cole

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On Sun, 2002-07-21 at 05:24, Craig Kulesa wrote:
>
>
> This is an update for the 2.5 port of Ed Tomlinson's patch to move slab
> pages onto the lru for page aging, atop 2.5.27 and the full rmap patch.
> It is aimed at being a fairer, self-tuning way to target and evict slab
> pages.
>
> Previous description:
> http://mail.nl.linux.org/linux-mm/2002-07/msg00216.html
> Patch URL:
> http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/2.5.27/
>

While trying to boot 2.5.27-rmap-slablru, I got this early in the boot:

Kernel panic: Failed to create pte-chain mempool!
In idle task - not syncing

No other information was available.
I had previously booted and run 2.5.27 and 2.5.27-rmap. I had to unset
CONFIG_QUOTA to get 2.5.27-rmap-slablru to compile.
I first applied the 2.5.27-rmap-1-rmap13b patch for 2.5.27-rmap, and
then applied the 2.5.27-rmap-2-slablru patch for 2.5.27-rmap-slablru.

The test machine is a dual p3 valinux 2231. Some options from .config:

[steven@spc9 linux-2.5.27-ck]$ grep HIGH .config
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
# CONFIG_HIGHPTE is not set
CONFIG_HIGHMEM=y

Steven




2002-07-22 22:14:50

by Ed Tomlinson

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On July 22, 2002 02:54 pm, Steven Cole wrote:
> On Sun, 2002-07-21 at 05:24, Craig Kulesa wrote:
> > This is an update for the 2.5 port of Ed Tomlinson's patch to move slab
> > pages onto the lru for page aging, atop 2.5.27 and the full rmap patch.
> > It is aimed at being a fairer, self-tuning way to target and evict slab
> > pages.
> >
> > Previous description:
> > http://mail.nl.linux.org/linux-mm/2002-07/msg00216.html
> > Patch URL:
> > http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/2.5.27/
>
> While trying to boot 2.5.27-rmap-slablru, I got this early in the boot:
>
> Kernel panic: Failed to create pte-chain mempool!
> In idle task - not syncing

This is not the result of slablru (rmap, yes); rather, it looks to be what
Andrew Morton was worried about in the "Re: pte_chain_mempool-2.5.27-1"
thread. Looks like the chunk of contiguous memory needed for the
pte_chain mempool cannot be obtained...

Ed Tomlinson

2002-07-22 22:18:50

by William Lee Irwin III

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On Sun, 2002-07-21 at 05:24, Craig Kulesa wrote:
>> This is an update for the 2.5 port of Ed Tomlinson's patch to move slab
>> pages onto the lru for page aging, atop 2.5.27 and the full rmap patch.
>> It is aimed at being a fairer, self-tuning way to target and evict slab
>> pages.

On Mon, Jul 22, 2002 at 12:54:28PM -0600, Steven Cole wrote:
> While trying to boot 2.5.27-rmap-slablru, I got this early in the boot:
> Kernel panic: Failed to create pte-chain mempool!
> In idle task - not syncing

The pte_chain mempool was ridiculously huge and the use of mempool for
this at all was in error.


Cheers,
Bill

2002-07-22 22:35:49

by Craig Kulesa

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27


On Mon, 22 Jul 2002, William Lee Irwin III wrote:

> The pte_chain mempool was ridiculously huge and the use of mempool for
> this at all was in error.

That's what I thought too -- but Steven tried making the pool 1/4th the
size and it still failed. OTOH, he tried 2.5.27-rmap, which uses the
*same mempool patch* and he had no problem with the monster 128KB
allocation. Maybe it was all luck. :) I can't yet see anything in the
slablru patch that has anything to do with it...

On another note -- Steven did point out that the slablru patch has a
patchbug with regard to dquot.c. I think this error is also in Ed's
June 5th patch (at least as posted), and I didn't catch it.
I believe that:

shrink_dqcache_memory(int priority, unsigned int gfp_mask)
needs to be
age_dqcache_memory(kmem_cache_t *cachep, int entries, int gfp_mask)

in dquot.c. It'll be tested and fixed on the next go. :)

Best regards,
Craig Kulesa

2002-07-23 00:19:05

by Ed Tomlinson

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

Craig Kulesa wrote:

> On another note -- Steven did point out that the slablru patch has a
> patchbug with regards to dquot.c. I think this error is also in Ed's
> June 5th patch (at least as posted), and I didn't catch it.
> I believe that:
>
> shrink_dqcache_memory(int priority, unsigned int gfp_mask)
> needs to be
> age_dqcache_memory(kmem_cache_t *cachep, int entries, int gfp_mask)
>
> in dquot.c. It'll be tested and fixed on the next go. :)

Right. Fixed in the linux-2.4-rmap bk tree with slablru at casa.dyndns.org:3334

Thanks
Ed Tomlinson

2002-07-23 04:33:52

by William Lee Irwin III

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On Mon, 22 Jul 2002, William Lee Irwin III wrote:
>> The pte_chain mempool was ridiculously huge and the use of mempool for
>> this at all was in error.

On Mon, Jul 22, 2002 at 03:36:33PM -0700, Craig Kulesa wrote:
> That's what I thought too -- but Steven tried making the pool 1/4th the
> size and it still failed. OTOH, he tried 2.5.27-rmap, which uses the
> *same mempool patch* and he had no problem with the monster 128KB
> allocation. Maybe it was all luck. :) I can't yet see anything in the
> slablru patch that has anything to do with it...

While waiting for the other machine to boot I tried these out. There
appear to be bootstrap ordering problems either introduced by or
exposed by this patch:

Cheers,
Bill


[log buffer too small to capture the whole thing]

CPU0<T0:99984,T1:96944,D:10,S:3030,C:99992>
cpu: 1, clocks: 99992, slice: 3030
cpu: 2, clocks: 99992, slice: 3030
cpu: 3, clocks: 99992, slice: 3030
cpu: 4, clocks: 99992, slice: 3030
cpu: 5, clocks: 99992, slice: 3030
cpu: 7, clocks: 99992, slice: 3030
cpu: 6, clocks: 99992, slice: 3030
cpu: 9, clocks: 99992, slice: 3030
cpu: 10, clocks: 99992, slice: 3030
cpu: 11, clocks: 99992, slice: 3030
cpu: 8, clocks: 99992, slice: 3030
cpu: 14, clocks: 99992, slice: 3030
cpu: 15, clocks: 99992, slice: 3030
cpu: 12, clocks: 99992, slice: 3030
cpu: 13, clocks: 99992, slice: 3030
CPU2<T0:99984,T1:90880,D:14,S:3030,C:99992>
CPU3<T0:99984,T1:87856,D:8,S:3030,C:99992>
CPU5<T0:99984,T1:81792,D:12,S:3030,C:99992>
CPU7<T0:99984,T1:75744,D:0,S:3030,C:99992>
CPU4<T0:99984,T1:84832,D:2,S:3030,C:99992>
CPU6<T0:99984,T1:78768,D:6,S:3030,C:99992>
CPU12<T0:99984,T1:60592,D:2,S:3030,C:99992>
CPU10<T0:99984,T1:66640,D:14,S:3030,C:99992>
CPU11<T0:99984,T1:63616,D:8,S:3030,C:99992>
CPU8<T0:99984,T1:72704,D:10,S:3030,C:99992>
CPU9<T0:99984,T1:69680,D:4,S:3030,C:99992>
CPU15<T0:99984,T1:51504,D:0,S:3030,C:99992>
CPU13<T0:99984,T1:57552,D:12,S:3030,C:99992>
CPU14<T0:99984,T1:54528,D:6,S:3030,C:99992>
CPU1<T0:99984,T1:93920,D:4,S:3030,C:99992>
checking TSC synchronization across CPUs:
BIOS BUG: CPU#0 improperly initialized, has 7692500 usecs TSC skew! FIXED.
BIOS BUG: CPU#1 improperly initialized, has 7692500 usecs TSC skew! FIXED.
BIOS BUG: CPU#2 improperly initialized, has 7692500 usecs TSC skew! FIXED.
BIOS BUG: CPU#3 improperly initialized, has 7692500 usecs TSC skew! FIXED.
BIOS BUG: CPU#4 improperly initialized, has 7750408 usecs TSC skew! FIXED.
BIOS BUG: CPU#5 improperly initialized, has 7750408 usecs TSC skew! FIXED.
BIOS BUG: CPU#6 improperly initialized, has 7750408 usecs TSC skew! FIXED.
BIOS BUG: CPU#7 improperly initialized, has 7750408 usecs TSC skew! FIXED.
BIOS BUG: CPU#8 improperly initialized, has 7773209 usecs TSC skew! FIXED.
BIOS BUG: CPU#9 improperly initialized, has 7773162 usecs TSC skew! FIXED.
BIOS BUG: CPU#10 improperly initialized, has 7773162 usecs TSC skew! FIXED.
BIOS BUG: CPU#11 improperly initialized, has 7773161 usecs TSC skew! FIXED.
BIOS BUG: CPU#12 improperly initialized, has -23216067 usecs TSC skew! FIXED.
BIOS BUG: CPU#13 improperly initialized, has -23216089 usecs TSC skew! FIXED.
BIOS BUG: CPU#14 improperly initialized, has -23216089 usecs TSC skew! FIXED.
BIOS BUG: CPU#15 improperly initialized, has -23216089 usecs TSC skew! FIXED.
migration_task 0 on cpu=0
migration_task 1 on cpu=1
migration_task 2 on cpu=2
migration_task 3 on cpu=3
migration_task 4 on cpu=4
migration_task 5 on cpu=5
migration_task 6 on cpu=6
migration_task 7 on cpu=7
migration_task 8 on cpu=8
migration_task 9 on cpu=9
migration_task 10 on cpu=10
migration_task 11 on cpu=11
migration_task 12 on cpu=12
migration_task 13 on cpu=13
migration_task 14 on cpu=14
migration_task 15 on cpu=15
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Initializing RT netlink socket
PCI: PCI BIOS revision 2.10 entry at 0xfd231, last bus=2
PCI: Using configuration type 1
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
PCI: Probing PCI hardware (bus 00)
PCI: Searching for i450NX host bridges on 00:10.0
Unknown bridge resource 2: assuming transparent
Scanning PCI bus 3 for quad 1
Scanning PCI bus 6 for quad 2
Scanning PCI bus 9 for quad 3
PCI->APIC IRQ transform: (B0,I10,P0) -> 23
PCI->APIC IRQ transform: (B0,I11,P0) -> 19
PCI->APIC IRQ transform: (B2,I15,P0) -> 28
PCI: using PPB(B0,I12,P0) to get irq 15
PCI->APIC IRQ transform: (B1,I4,P0) -> 15
PCI: using PPB(B0,I12,P1) to get irq 13
PCI->APIC IRQ transform: (B1,I5,P1) -> 13
PCI: using PPB(B0,I12,P2) to get irq 11
PCI->APIC IRQ transform: (B1,I6,P2) -> 11
PCI: using PPB(B0,I12,P3) to get irq 7
PCI->APIC IRQ transform: (B1,I7,P3) -> 7
enable_cpucache failed for dentry_cache, error 12.
enable_cpucache failed for filp, error 12.
enable_cpucache failed for names_cache, error 12.
enable_cpucache failed for buffer_head, error 12.
enable_cpucache failed for mm_struct, error 12.
enable_cpucache failed for vm_area_struct, error 12.
enable_cpucache failed for fs_cache, error 12.
enable_cpucache failed for files_cache, error 12.
enable_cpucache failed for signal_act, error 12.
enable_cpucache failed for task_struct, error 12.
enable_cpucache failed for pte_chain, error 12.
enable_cpucache failed for pae_pgd, error 12.
enable_cpucache failed for size-4096 (DMA), error 12.
enable_cpucache failed for size-4096, error 12.
enable_cpucache failed for size-2048 (DMA), error 12.
enable_cpucache failed for size-2048, error 12.
enable_cpucache failed for size-1024 (DMA), error 12.
enable_cpucache failed for size-1024, error 12.
enable_cpucache failed for size-512 (DMA), error 12.
enable_cpucache failed for size-512, error 12.
enable_cpucache failed for size-256 (DMA), error 12.
enable_cpucache failed for size-256, error 12.
enable_cpucache failed for size-192 (DMA), error 12.
enable_cpucache failed for size-192, error 12.
enable_cpucache failed for size-128 (DMA), error 12.
enable_cpucache failed for size-128, error 12.
enable_cpucache failed for size-96 (DMA), error 12.
enable_cpucache failed for size-96, error 12.
enable_cpucache failed for size-64 (DMA), error 12.
enable_cpucache failed for size-64, error 12.
enable_cpucache failed for size-32 (DMA), error 12.
enable_cpucache failed for size-32, error 12.
Starting kswapd
enable_cpucache failed for shmem_inode_cache, error 12.
could not kern_mount tmpfs

2002-07-23 11:42:54

by Ed Tomlinson

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On July 23, 2002 12:36 am, William Lee Irwin III wrote:
> On Mon, 22 Jul 2002, William Lee Irwin III wrote:
> >> The pte_chain mempool was ridiculously huge and the use of mempool for
> >> this at all was in error.
>
> On Mon, Jul 22, 2002 at 03:36:33PM -0700, Craig Kulesa wrote:
> > That's what I thought too -- but Steven tried making the pool 1/4th the
> > size and it still failed. OTOH, he tried 2.5.27-rmap, which uses the
> > *same mempool patch* and he had no problem with the monster 128KB
> > allocation. Maybe it was all luck. :) I can't yet see anything in the
> > slablru patch that has anything to do with it...
>
> While waiting for the other machine to boot I tried these out. There
> appear to be bootstrap ordering problems either introduced by or
> exposed by this patch:

I would vote for ordering. The slab init code was changed to initialize
new fields... Allocating memory for slabs is another story. They depend on
the lru lists and the pagemap_lru_lock being set up... Has this happened
by the time slab storage is initialized? If not, a call to do this in the
slab init logic would fix things. It could also be fixed using this
fragment (slab.c):

+ /*
+ * We want the pagemap_lru_lock. In UP, spin locks do not
+ * protect us in interrupt context... In SMP they do but,
+ * optimizing for speed, we proceed if we do not get it.
+ */
+ if (!(cachep->flags & SLAB_NO_REAP)) {
+#ifdef CONFIG_SMP
+ locked = spin_trylock(&pagemap_lru_lock);
+#else
+ locked = !in_interrupt() && spin_trylock(&pagemap_lru_lock);
+#endif
+ if (!locked && !in_interrupt())
+ goto opps1;

That is, if there is some way to verify that the pagemap_lru_lock is
ready. Note it's fine to just set locked to 0 and proceed, as long as this
condition does not last forever. Also, this code is in a fastpath, so
efficiency matters...

Thoughts?
Ed Tomlinson




2002-07-23 14:31:21

by Steven Cole

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On Mon, 2002-07-22 at 16:36, Craig Kulesa wrote:
>
> On Mon, 22 Jul 2002, William Lee Irwin III wrote:
>
> > The pte_chain mempool was ridiculously huge and the use of mempool for
> > this at all was in error.
>
[snipped]
>
> in dquot.c. It'll be tested and fixed on the next go. :)

First, the good news: the 2.5.27-rmap-2b-dqcache patch fixed the compile
problem with CONFIG_QUOTA=y.

Then, I patched in 2.5.27-rmap-3-slaballoc from Craig's site and the
test machine got much further in the boot, but hung up here:

Starting cron daemon
/etc/rc.d/rc3.d/S50inet: fork: Cannot allocate memory

Sorry, no further information was available.

Steven

2002-07-24 20:29:05

by Steven Cole

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On Tue, 2002-07-23 at 08:31, I (Steven Cole) wrote:
> On Mon, 2002-07-22 at 16:36, Craig Kulesa wrote:
> >
> > On Mon, 22 Jul 2002, William Lee Irwin III wrote:
> >
> > > The pte_chain mempool was ridiculously huge and the use of mempool for
> > > this at all was in error.
> >
> [snipped]
> >
> > in dquot.c. It'll be tested and fixed on the next go. :)
>
> First, the good news: the 2.5.27-rmap-2b-dqcache patch fixed the compile
> problem with CONFIG_QUOTA=y.
>
> Then, I patched in 2.5.27-rmap-3-slaballoc from Craig's site and the
> test machine got much further in the boot, but hung up here:
>
> Starting cron daemon
> /etc/rc.d/rc3.d/S50inet: fork: Cannot allocate memory
>
> Sorry, no further information was available.

I finally got some time for more testing, and I booted this very same
2.5.25-rmap-slablru kernel on the same machine, and this time it booted
just fine. Then I began to exercise the box a little by running dbench
with increasing numbers of clients. At 28 clients, I got this:

(31069) open CLIENTS/CLIENT16/~DMTMP/WORDPRO/BENCHS1.PRN failed for handle 4148 (Cannot allocate memory)
(31070) nb_close: handle 4148 was not open
(31073) unlink CLIENTS/CLIENT16/~DMTMP/WORDPRO/BENCHS1.PRN failed (No such file or directory)

Right after starting 32 dbench clients, the box locked up, no longer
responding to the keyboard. It did respond to pings, but nothing else.

This hardware does run other kernels successfully, most recently
2.4.19-rc3-ac3 and dbench 128 (load over 100).

Steven


2002-07-24 21:02:49

by Steven Cole

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

On Wed, 2002-07-24 at 14:28, Steven Cole wrote:
[snipped]
> I finally got some time for more testing, and I booted this very same
> 2.5.25-rmap-slablru kernel on the same machine, and this time it booted
2.5.27-rmap-slablru I meant to say.
> just fine. Then I began to exercise the box a little by running dbench
> with increasing numbers of clients. At 28 clients, I got this:
On closer inspection, these errors began at 6 clients.
>
> (31069) open CLIENTS/CLIENT16/~DMTMP/WORDPRO/BENCHS1.PRN failed for handle 4148 (Cannot allocate memory)
> (31070) nb_close: handle 4148 was not open
> (31073) unlink CLIENTS/CLIENT16/~DMTMP/WORDPRO/BENCHS1.PRN failed (No such file or directory)
>
> Right after starting 32 dbench clients, the box locked up, no longer
> responding to the keyboard. It did respond to pings, but nothing else.
>
> This hardware does run other kernels successfully, most recently
> 2.4.19-rc3-ac3 and dbench 128 (load over 100).

I then tried rebooting 2.5.27-rmap-slablru with /home mounted as ext3,
and immediately after starting dbench 1, I got this message about 10
times or so:

ENOMEM in do_get_write_access, retrying.

And the box was locked up. Next time, I'll have CONFIG_MAGIC_SYSRQ=y.
Meanwhile, it is running the dbench 1 to 64 series under 2.4.19-rc3 with
no problems at all.

Steven

2002-07-25 00:10:18

by Ed Tomlinson

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27

Hi,

This patch fixes the SMP problems for Steve. It was a thinko I put in to
check things... In SMP it was just plain broken. Patch is against 2.4.26
with Craig's patches but works on 2.5.27 and 2.4.x too. My bk tree with
slablru based on linux-2.4-rmap at casa.dyndns.org:3334 has also been
updated.

Thanks,
Ed Tomlinson


# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.430 -> 1.431
# mm/slab.c 1.23 -> 1.24
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/07/24 [email protected] 1.431
# Prevent false out of memory reporting in SMP
# --------------------------------------------
#
diff -Nru a/mm/slab.c b/mm/slab.c
--- a/mm/slab.c Wed Jul 24 17:22:31 2002
+++ b/mm/slab.c Wed Jul 24 17:22:31 2002
@@ -1309,8 +1309,6 @@
#else
locked = !in_interrupt() && spin_trylock(&pagemap_lru_lock);
#endif
- if (!locked && !in_interrupt())
- goto opps1;
}

/* Get slab management. */

2002-07-25 12:00:44

by Craig Kulesa

Subject: Re: [PATCH 2/2] move slab pages to the lru, for 2.5.27


On Wed, 24 Jul 2002, Ed Tomlinson wrote:

> This patch fixes the SMP problems for Steve.

Good sleuthing! Glad to hear this seems to solve the bizarre SMP
out-of-memory problems. However, you should know that there *might* still
be demons lurking about.

I still have problems with 2.5.27-rmap-slablru with CONFIG_SMP booting on
a UP laptop, whereas 2.5.27-rmap (the big rmap patch) works fine in SMP mode.
I have spinlock debugging turned on and get oopses with modprobe trying to
load the rtc module. It fails this test in include/asm/spinlock.h:

#ifdef CONFIG_DEBUG_SPINLOCK
	if (lock->magic != SPINLOCK_MAGIC)
		BUG();

Modprobe also traps itself in infinite loops trying to load unix.o for
net-pf-1. Eeeks. I'll test on other UP boxes in SMP mode and see if I
can trigger anything.


For now, I've applied Ed's patch and tested that it doesn't cause any
problems for UP behavior, so I added it to the patch queue against 2.5.27;
it is also included in the rmap patches for 2.5.28, which you can download
here:

http://loke.as.arizona.edu/~ckulesa/kernel/rmap-vm/2.5.28/

The only new change for 2.5.28 is fixing software suspend to work
with the full rmap patch. I tested swsusp with 2.5.28-rmap-slablru, and
it's very cool. :)

Although I suspect SMP folks will have their hands busy with *other*
things in 2.5.28 (!!), more SMP feedback regarding slab-on-LRU would be
most helpful!

Thanks,
Craig Kulesa