2005-02-03 10:29:50

by Terje Fåberg

Subject: 2.6.10: kswapd spins like crazy


I recently upgraded my desktop from 2.4.28 to
2.6.10. Even under moderate memory pressure kswapd
regularly eats almost all available cpu time
whenever there is a little more IO throughput,
like copying large files. The system is extremely
sluggish during this. The system load goes up to
7.5 or more.

This is a Pentium3-866 with 768MB RAM, 2x1GB
swap partitions, vanilla 2.6.10. The strange
behaviour starts at about 200 MB of swap in use.
2.4.28 masters the same workload without any
problems.

vmstat:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
6 1 428012 4868 33236 347184 10 7 147 120 108 111 19 10 68 3

Is there anything I can do to track this down?

Regards,
Terje


2005-02-03 10:48:29

by Nick Piggin

Subject: Re: 2.6.10: kswapd spins like crazy

On Thu, 2005-02-03 at 11:29 +0100, Terje Fåberg wrote:
> I recently upgraded my desktop from 2.4.28 to
> 2.6.10. Even under moderate memory pressure kswapd
> regularly eats almost all available cpu time
> whenever there is a little more IO throughput,
> like copying large files. The system is extremely
> sluggish during this. The system load goes up to
> 7.5 or more.
>
> This is a Pentium3-866 with 768MB RAM, 2x1GB
> swap partitions, vanilla 2.6.10. The strange
> behaviour starts at about 200 MB of swap in use.
> 2.4.28 masters the same workload without any
> problems.
>
> vmstat:
> procs -----------memory----------
> r b swpd free buff cache
> 6 1 428012 4868 33236 347184
> ---swap-- -----io---- --system-- ----cpu----
> si so bi bo in cs us sy id wa
> 10 7 147 120 108 111 19 10 68 3
>
> Is there anything I can do to track this down?
>

Can you post about 10 seconds of `vmstat 1` output
while this is happening?

Also:
`cat /proc/vmstat > pre ; sleep 10 ; cat /proc/vmstat > post`
while this is happening, and send the pre and post files.

cat /proc/meminfo also might be helpful.

And compile a kernel with "magic sysrq" support, and get a
couple of Alt+SysRq+M dumps (the output will be in dmesg).

Thanks,
Nick

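The pre/post comparison requested above is easier to read as a list of deltas. The following is a minimal sketch added for illustration, not code from the thread. It assumes both files hold the `name value` lines that /proc/vmstat prints; the helper names `parse_vmstat` and `vmstat_delta` are made up here.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, value = parts
        counters[name] = int(value)
    return counters


def vmstat_delta(pre_text, post_text):
    """Return {counter: post - pre} for every counter whose value changed."""
    pre = parse_vmstat(pre_text)
    post = parse_vmstat(post_text)
    return {
        name: post[name] - pre[name]
        for name in post
        if name in pre and post[name] != pre[name]
    }
```

With the two files produced by the command above, `vmstat_delta(open("pre").read(), open("post").read())` would return only the counters that moved during the ten-second window, which makes a runaway counter easy to spot.
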


2005-02-03 11:56:12

by Terje Fåberg

Subject: Re: 2.6.10: kswapd spins like crazy


galileo:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 5 428692 4964 118492 320196 84 72 6560 696 1561 8993 40 60 0 0
5 3 428884 3392 118516 318960 140 368 6832 1172 1517 8563 40 60 0 0
5 3 429908 4888 120548 318092 108 1844 5812 2020 1498 7842 53 47 0 0
4 4 430076 3296 121876 318472 340 184 6900 604 1396 8502 43 57 0 0
5 3 430120 4776 121748 316820 80 440 6780 440 1391 8360 34 66 0 0
4 4 430112 4916 123016 317304 376 68 7056 440 1293 8852 23 77 0 0
4 7 430096 4916 123576 316468 348 60 7324 204 1233 8290 21 79 0 0
5 3 430032 14084 129040 316960 192 0 6664 464 1380 8403 24 76 0 0
4 4 430032 7044 135060 317516 244 0 6424 0 1166 8217 17 83 0 0
5 3 430064 4548 138072 317388 172 216 6364 216 1176 8312 17 83 0 0
2 3 430132 4856 139000 316860 252 156 6656 872 1311 8125 19 81 0 0
^C

galileo:~# cat /proc/meminfo
MemTotal: 646052 kB
MemFree: 3296 kB
Buffers: 156912 kB
Cached: 314876 kB
SwapCached: 47524 kB
Active: 92792 kB
Inactive: 447588 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 646052 kB
LowFree: 3296 kB
SwapTotal: 2101056 kB
SwapFree: 1661600 kB
Dirty: 12088 kB
Writeback: 0 kB
Mapped: 103476 kB
Slab: 30032 kB
CommitLimit: 2424080 kB
Committed_AS: 3125208 kB
PageTables: 8440 kB
VmallocTotal: 384980 kB
VmallocUsed: 7392 kB
VmallocChunk: 377500 kB

galileo:~# cat /proc/vmstat > pre ; sleep 10 ; cat /proc/vmstat > post

galileo:~# cat pre
nr_dirty 61
nr_writeback 138
nr_unstable 0
nr_page_table_pages 2118
nr_mapped 24903
nr_slab 7494
pgpgin 40072965
pgpgout 32683347
pswpin 707678
pswpout 491400
pgalloc_high 0
pgalloc_normal 289749372
pgalloc_dma 5185962
pgfree 294936222
pgactivate 7678427
pgdeactivate 7086934
pgfault 76930918
pgmajfault 422426
pgrefill_high 0
pgrefill_normal 63766162
pgrefill_dma 3133019
pgsteal_high 0
pgsteal_normal 11946755
pgsteal_dma 855413
pgscan_kswapd_high 0
pgscan_kswapd_normal 31430190
pgscan_kswapd_dma 2037500863
pgscan_direct_high 0
pgscan_direct_normal 1083423
pgscan_direct_dma 89251
pginodesteal 0
slabs_scanned 15591040
kswapd_steal 12527148
kswapd_inodesteal 2803439
pageoutrun 3511541
allocstall 6111
pgrotated 719114

galileo:~# cat post
nr_dirty 504
nr_writeback 38
nr_unstable 0
nr_page_table_pages 2093
nr_mapped 25652
nr_slab 7488
pgpgin 40106505
pgpgout 32695255
pswpin 710721
pswpout 491907
pgalloc_high 0
pgalloc_normal 289790611
pgalloc_dma 5185979
pgfree 294977468
pgactivate 7680721
pgdeactivate 7089056
pgfault 76933748
pgmajfault 423145
pgrefill_high 0
pgrefill_normal 63776342
pgrefill_dma 3133311
pgsteal_high 0
pgsteal_normal 11957164
pgsteal_dma 855422
pgscan_kswapd_high 0
pgscan_kswapd_normal 31443126
pgscan_kswapd_dma 2038597486
pgscan_direct_high 0
pgscan_direct_normal 1100385
pgscan_direct_dma 90604
pginodesteal 0
slabs_scanned 15596032
kswapd_steal 12531233
kswapd_inodesteal 2803526
pageoutrun 3511829
allocstall 6272
pgrotated 719582


Attachments:
stat (3.16 kB)

2005-02-03 20:04:17

by Terje Fåberg

Subject: Re: 2.6.10: kswapd spins like crazy


galileo:~# vmstat 1
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 286576 4984 8712 352232 0 0 0 88 1015 4288 5 95 0 0
1 0 286576 4984 8712 352232 0 0 0 0 1002 4278 5 95 0 0
1 0 286576 4984 8712 352232 32 0 32 0 1003 4365 7 93 0 0
1 0 286576 4984 8736 352256 0 0 40 0 1010 4296 6 94 0 0
2 1 286696 4856 8756 352328 0 120 920 120 1081 4406 4 96 0 0
1 0 287068 4936 8832 352496 104 448 588 568 1072 4422 7 93 0 0
1 0 287068 4936 8832 352496 0 0 0 0 1002 4275 5 95 0 0
1 0 287068 4936 8832 352496 0 0 0 0 1002 4289 6 94 0 0
1 0 287068 4936 8832 352496 0 0 0 0 1002 4324 6 94 0 0
1 0 287068 4936 8832 352496 0 0 0 0 1002 4285 5 95 0 0
1 0 287068 4936 8832 352496 0 0 0 0 1002 4271 5 95 0 0
1 0 287068 4936 8832 352496 0 0 0 0 1002 4335 6 94 0 0
1 0 287068 4936 8832 352496 0 0 0 0 1002 4297 5 95 0 0

galileo:~# cat /proc/vmstat > pre ; sleep 10 ; cat /proc/vmstat > post

galileo:~# cat pre
nr_dirty 201
nr_writeback 0
nr_unstable 0
nr_page_table_pages 1667
nr_mapped 113889
nr_slab 4289
pgpgin 1653048
pgpgout 532204
pswpin 67956
pswpout 84224
pgalloc_high 0
pgalloc_normal 6255968
pgalloc_dma 91765
pgfree 6350163
pgactivate 381383
pgdeactivate 364613
pgfault 2110239
pgmajfault 36305
pgrefill_high 0
pgrefill_normal 4903463
pgrefill_dma 116195
pgsteal_high 0
pgsteal_normal 366259
pgsteal_dma 17568
pgscan_kswapd_high 0
pgscan_kswapd_normal 2504667
pgscan_kswapd_dma 615532032
pgscan_direct_high 0
pgscan_direct_normal 60489
pgscan_direct_dma 11979
pginodesteal 0
slabs_scanned 510336
kswapd_steal 364044
kswapd_inodesteal 99903
pageoutrun 105762
allocstall 435
pgrotated 77400

galileo:~# cat post
nr_dirty 31
nr_writeback 0
nr_unstable 0
nr_page_table_pages 1667
nr_mapped 113890
nr_slab 4285
pgpgin 1653308
pgpgout 532340
pswpin 67956
pswpout 84224
pgalloc_high 0
pgalloc_normal 6290302
pgalloc_dma 91765
pgfree 6384390
pgactivate 381413
pgdeactivate 364613
pgfault 2110638
pgmajfault 36308
pgrefill_high 0
pgrefill_normal 4903463
pgrefill_dma 116195
pgsteal_high 0
pgsteal_normal 366259
pgsteal_dma 17568
pgscan_kswapd_high 0
pgscan_kswapd_normal 2504667
pgscan_kswapd_dma 649881006
pgscan_direct_high 0
pgscan_direct_normal 60489
pgscan_direct_dma 11979
pginodesteal 0
slabs_scanned 514944
kswapd_steal 364044
kswapd_inodesteal 99903
pageoutrun 111269
allocstall 435
pgrotated 77400

galileo:~# cat /proc/meminfo
MemTotal: 645976 kB
MemFree: 8776 kB
Buffers: 9228 kB
Cached: 350380 kB
SwapCached: 74776 kB
Active: 443452 kB
Inactive: 97500 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 645976 kB
LowFree: 8776 kB
SwapTotal: 2101056 kB
SwapFree: 1812348 kB
Dirty: 56 kB
Writeback: 0 kB
Mapped: 455596 kB
Slab: 17124 kB
CommitLimit: 2424044 kB
Committed_AS: 1216312 kB
PageTables: 6668 kB
VmallocTotal: 384980 kB
VmallocUsed: 16420 kB
VmallocChunk: 367568 kB

galileo:~# uname -a
Linux galileo 2.6.10-4 #7 Thu Feb 3 16:34:30 CET 2005 i686 GNU/Linux

galileo:~# uptime
20:39:55 up 50 min, 2 users, load average: 4.54, 3.05, 2.25

galileo:~# ps aux | grep kswapd
root 105 34.5 0.0 0 0 ? R 19:49 17:27 [kswapd0]
root 8111 0.0 0.0 1548 444 pts/4 S+ 20:39 0:00 grep kswapd

galileo:~# dmesg
[...]
SysRq : Show Memory
Mem-info:
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu: empty

Free pages: 7872kB (0kB HighMem)
Active:48698 inactive:86241 dirty:0 writeback:0 unstable:0 free:1968 slab:4509 mapped:50560 pagetables:1717
DMA free:80kB min:80kB low:100kB high:120kB active:0kB inactive:11716kB present:16384kB pages_scanned:123 all_unreclaimable? no
protections[]: 0 0 0
Normal free:7792kB min:3152kB low:3940kB high:4728kB active:194792kB inactive:333248kB present:638976kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
HighMem free:0kB min:128kB low:160kB high:192kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
protections[]: 0 0 0
DMA: 0*4kB 0*8kB 1*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 80kB
Normal: 590*4kB 153*8kB 59*16kB 4*32kB 1*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 7792kB
HighMem: empty
Swap cache: add 173594, delete 157410, find 30045/43843, race 0+0
Free swap: 1763412kB
163840 pages of RAM
0 pages of HIGHMEM
9692 reserved pages
156561 pages shared
16184 pages swap cached


Attachments:
stat2 (4.83 kB)

2005-02-04 00:13:20

by Nick Piggin

Subject: Re: 2.6.10: kswapd spins like crazy

Terje Fåberg wrote:
> Terje Fåberg <[email protected]> wrote:
>
>
>>The kernel is compiling right now, but I cannot
>>reboot this machine until six or seven o'clock
>>tonight (CET). I will report then.
>
>
> Well, well, I rebooted the same kernel, now with
> MAGIC-SYSRQ enabled. At first the kswapd effect
> wouldn't show up, but now the picture is much clearer
> than before. kswapd constantly eats 95% CPU time while
> the system is "idle".
>
> The system is quite sluggish. Switching between
> applications takes ages. After Eclipse has been active
> for a few minutes, it takes 45 seconds until enough
> of Mozilla is swapped back in and Mozilla has redrawn
> its window.
>
> Complete info including SysRq-Meminfo is attached.
>

Thanks very much, this is a great help.

> galileo:~# cat /proc/vmstat > pre ; sleep 10 ; cat /proc/vmstat > post
>
> galileo:~# cat pre
...
> pgscan_kswapd_high 0
> pgscan_kswapd_normal 2504667
> pgscan_kswapd_dma 615532032
...
>
> galileo:~# cat post
...
> pgscan_kswapd_high 0
> pgscan_kswapd_normal 2504667
> pgscan_kswapd_dma 649881006
...

So we can see it is trying to scan the DMA zone.
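
To put a number on that, here is a small worked calculation added for illustration. It uses only figures quoted in this thread (the two pgscan_kswapd_dma values, the `sleep 10` interval, and `present:16384kB` from the SysRq-M dump). The 4 kB page size is an assumption based on the `4kB` order-0 entries in the same dump, and the interval is only approximately ten seconds.

```python
# Figures quoted in this thread.
PRE_DMA_SCANNED = 615532032    # pgscan_kswapd_dma in "pre"
POST_DMA_SCANNED = 649881006   # pgscan_kswapd_dma in "post"
INTERVAL_S = 10                # from "sleep 10" (approximate)
DMA_PRESENT_KB = 16384         # "present:16384kB" in the SysRq-M dump
PAGE_KB = 4                    # assumption: 4 kB pages

pages_scanned = POST_DMA_SCANNED - PRE_DMA_SCANNED
scan_rate = pages_scanned / INTERVAL_S
dma_zone_pages = DMA_PRESENT_KB // PAGE_KB
zone_passes_per_s = scan_rate / dma_zone_pages

print(f"DMA pages scanned in {INTERVAL_S}s: {pages_scanned}")
print(f"scan rate: {scan_rate:.0f} pages/s")
print(f"equivalent full passes over the zone: {zone_passes_per_s:.0f} per second")
```

Over the same window pgscan_kswapd_normal did not change at all, so essentially all of the scanning work went into the 16 MB DMA zone.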

> galileo:~# dmesg
> [...]
> SysRq : Show Memory
> Mem-info:
> DMA per-cpu:
> cpu 0 hot: low 2, high 6, batch 1
> cpu 0 cold: low 0, high 2, batch 1
> Normal per-cpu:
> cpu 0 hot: low 32, high 96, batch 16
> cpu 0 cold: low 0, high 32, batch 16
> HighMem per-cpu: empty
>
> Free pages: 7872kB (0kB HighMem)
> Active:48698 inactive:86241 dirty:0 writeback:0 unstable:0 free:1968 slab:4509 mapped:50560 pagetables:1717
> DMA free:80kB min:80kB low:100kB high:120kB active:0kB inactive:11716kB present:16384kB pages_scanned:123 all_unreclaimable? no
> protections[]: 0 0 0

This is the reason why: DMA only has 80K free, and kswapd won't stop until either 120K
is free, or all_unreclaimable gets switched on.
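
The stopping rule described here can be written down as a tiny check. This is a simplified sketch added for illustration, not the kernel's code: the `Zone` class and `kswapd_keeps_working` are hypothetical names, and the comparison is reduced to "free below the high watermark". The numbers are the DMA and Normal lines from the SysRq-M dump.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    free_kb: int
    high_kb: int
    all_unreclaimable: bool


def kswapd_keeps_working(zone):
    """Simplified rule: keep reclaiming while free memory is below the
    high watermark and the zone has not been flagged all_unreclaimable."""
    return zone.free_kb < zone.high_kb and not zone.all_unreclaimable


# Values from the SysRq-M dump in this thread.
dma = Zone("DMA", free_kb=80, high_kb=120, all_unreclaimable=False)
normal = Zone("Normal", free_kb=7792, high_kb=4728, all_unreclaimable=False)

for zone in (dma, normal):
    state = "keeps reclaiming" if kswapd_keeps_working(zone) else "can rest"
    print(f"{zone.name}: free {zone.free_kb} kB, high {zone.high_kb} kB -> kswapd {state}")
```

Under this rule only the DMA zone keeps kswapd busy; the Normal zone is already above its high watermark.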

Now clearly all_unreclaimable should be getting set if nothing can be reclaimed (although
it is possible that non pagecache allocating and freeing can mess it up, that's unlikely).

Hmm, your DMA zone has no active pages, and pages_scanned (which triggers all_unreclaimable)
is only incremented when scanning the active list. But I wonder, if the pages can't be
freed, why aren't they being put on the active list?

Nick

PS. let's not release 2.6.11 just yet :\

2005-02-04 01:20:31

by Nick Piggin

Subject: Re: 2.6.10: kswapd spins like crazy




---

linux-2.6-npiggin/mm/vmscan.c | 1 +
1 files changed, 1 insertion(+)

diff -puN mm/vmscan.c~vmscan-minfix mm/vmscan.c
--- linux-2.6/mm/vmscan.c~vmscan-minfix 2005-02-04 11:52:37.000000000 +1100
+++ linux-2.6-npiggin/mm/vmscan.c 2005-02-04 11:53:32.000000000 +1100
@@ -575,6 +575,7 @@ static void shrink_cache(struct zone *zo
nr_taken++;
}
zone->nr_inactive -= nr_taken;
+ zone->pages_scanned += nr_scan;
spin_unlock_irq(&zone->lru_lock);

if (nr_taken == 0)

_


Attachments:
vmscan-minfix.patch (491.00 B)
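
The effect of the one-line change can be illustrated with a toy model. This is a sketch under loud assumptions, not the kernel's reclaim code: it assumes a zone whose pages all sit on the inactive list and can never be freed, that each pass walks the whole list, and that the zone is flagged once pages_scanned reaches a made-up threshold of four times the list size. The function name and parameters are hypothetical.

```python
def passes_until_unreclaimable(inactive_pages, count_inactive_scans,
                               threshold_factor=4, max_passes=1000):
    """Toy model: return the number of scan passes before the zone would be
    flagged all_unreclaimable, or None if that never happens in max_passes.

    threshold_factor is an assumed value, not taken from the kernel source.
    """
    active_pages = 0  # as in the dump: "active:0kB" for the DMA zone
    pages_scanned = 0
    threshold = threshold_factor * (active_pages + inactive_pages)
    for n in range(1, max_passes + 1):
        # Active-list scan: the only scan counted before the patch.
        pages_scanned += active_pages
        # Inactive-list scan: counted only with the one-line patch applied.
        if count_inactive_scans:
            pages_scanned += inactive_pages
        if pages_scanned >= threshold:
            return n
    return None


# "inactive:11716kB" in the dump is 2929 pages at an assumed 4 kB per page.
print("without patch:", passes_until_unreclaimable(2929, count_inactive_scans=False))
print("with patch:   ", passes_until_unreclaimable(2929, count_inactive_scans=True))
```

In this model the unpatched counter never moves because the active list is empty, so the zone is never flagged and the scanning never ends; with inactive scans counted, the flag is reached after a handful of passes.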

2005-02-04 01:31:33

by Nick Piggin

Subject: Re: 2.6.10: kswapd spins like crazy


Andrew Morton wrote:
> Nick Piggin <[email protected]> wrote:
>
>>Oh, attached should be a minimal fix if you would like to try it out.
>>
>>
>>...
>>--- linux-2.6/mm/vmscan.c~vmscan-minfix 2005-02-04 11:52:37.000000000 +1100
>>+++ linux-2.6-npiggin/mm/vmscan.c 2005-02-04 11:53:32.000000000 +1100
>>@@ -575,6 +575,7 @@ static void shrink_cache(struct zone *zo
>> nr_taken++;
>> }
>> zone->nr_inactive -= nr_taken;
>>+ zone->pages_scanned += nr_scan;
>> spin_unlock_irq(&zone->lru_lock);
>>
>> if (nr_taken == 0)
>>
>
>
> Any theories as to why these pages aren't being activated and aren't being
> reclaimed?
>
>

No, none yet, and that is what we should get to the bottom of. I must be
overlooking something, but the only causes I can see are transient
conditions like a page being locked or under writeback. laptop_mode?

Terje, what is /proc/sys/vm/laptop_mode set to?

2005-02-04 01:38:56

by Andrew Morton

Subject: Re: 2.6.10: kswapd spins like crazy

Nick Piggin <[email protected]> wrote:
>
> Oh, attached should be a minimal fix if you would like to try it out.
>
>
> ...
> --- linux-2.6/mm/vmscan.c~vmscan-minfix 2005-02-04 11:52:37.000000000 +1100
> +++ linux-2.6-npiggin/mm/vmscan.c 2005-02-04 11:53:32.000000000 +1100
> @@ -575,6 +575,7 @@ static void shrink_cache(struct zone *zo
> nr_taken++;
> }
> zone->nr_inactive -= nr_taken;
> + zone->pages_scanned += nr_scan;
> spin_unlock_irq(&zone->lru_lock);
>
> if (nr_taken == 0)
>

Any theories as to why these pages aren't being activated and aren't being
reclaimed?

2005-02-04 10:29:12

by Terje Fåberg

Subject: Re: 2.6.10: kswapd spins like crazy

Nick Piggin <[email protected]> wrote:

> No none yet, which is what we should get to the
> bottom of. I must be overlooking something, but the
> only ways I can see should be due to transient
> conditions like page locked or under writeback.
> laptop_mode?
>
> Terje, what is /proc/sys/vm/laptop_mode set to?

0. I didn't touch any vm-specific options at all.

I just rebooted with your patch. So far I can _not_ reproduce
the problem, but yesterday I couldn't reproduce it straightaway
either.

I'll continue to do the same things I did yesterday
before kswapd started to spin.

Regards,
Terje

2005-02-04 16:17:33

by Norman Weathers

Subject: RE: 2.6.10: kswapd spins like crazy



We have had a similar problem with all kernels since 2.6.8.1. It has
gotten so bad that we had to drop back to 2.6.7 with some extra patches
to get our systems working. Our situation is a little bit different.

We are using SMP Opteron boxes as NFS servers. Under almost any load at
all, kswapd goes nuts, taking up 99% of the CPU cycles for long periods
of time. With 2.6.7 it is not nearly as bad: just periods of about 3 - 5
seconds at 10 - 35% utilization, then off for a few seconds, then back
again. Sometimes kswapd lingers longer as the most aggressive app in top,
but with 2.6.7 the nfsds are the most prevalent.

Also, we have noticed something else. Our servers have dual Broadcom
gigabit nics (Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet
(rev 03)). We have bonded both NICS back to our core switch, both
running at gigabit speed. Under different loads, we start to get call
traces in dmesg and the syslog. An excerpt follows:


<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff8013ffd3>{del_timer+115}
<Jan/06 03:50 pm> <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm> <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm> <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm> <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff803085bb>{udp_rcv+267}
<Jan/06 03:50 pm> <ffffffff802e612e>{ip_rcv+526}
<ffffffff802d297d>{netif_receive_skb+477}
<Jan/06 03:50 pm>
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm> <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm> <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm> <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm> <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm> <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm> <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff801158b4>{end_8259A_irq+100}
<Jan/06 03:50 pm> <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm> <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm> <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm> <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff802e612e>{ip_rcv+526}
<Jan/06 03:50 pm> <ffffffff80131b2b>{try_to_wake_up+523}
<ffffffff802d297d>{netif_receive_skb+477}
<Jan/06 03:50 pm>
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm> <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm> <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm> <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm> <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm> <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm> <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff801158b4>{end_8259A_irq+100}
<Jan/06 03:50 pm> <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm> <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm> <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm> <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff803085bb>{udp_rcv+267}
<Jan/06 03:50 pm> <ffffffff802e612e>{ip_rcv+526}
<ffffffff802d297d>{netif_receive_skb+477}
<Jan/06 03:50 pm>
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm> <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm> <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm> <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm> <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm> <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm> <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff801158b4>{end_8259A_irq+100}
<Jan/06 03:50 pm> <ffffffff80158fe0>{__get_free_pages+16}
<ffffffff8015c886>{kmem_getpages+38}
<Jan/06 03:50 pm> <ffffffff8015d8be>{cache_grow+190}
<ffffffff8015db16>{cache_alloc_refill+422}
<Jan/06 03:50 pm> <ffffffff8015de06>{kmem_cache_alloc+54}
<ffffffff802d5eaf>{dst_alloc+47}
<Jan/06 03:50 pm> <ffffffff802e3d17>{ip_route_input_slow+1639}
<ffffffff802e612e>{ip_rcv+526}
<Jan/06 03:50 pm> <ffffffff802d297d>{netif_receive_skb+477}
<ffffffffa0120fe8>{:bcm5700:MM_IndicateRxPackets+920}
<Jan/06 03:50 pm> <ffffffffa011c9fe>{:bcm5700:bcm5700_poll+158}
<ffffffff802d2b94>{net_rx_action+132}
<Jan/06 03:50 pm> <ffffffff8013c4b1>{__do_softirq+113}
<ffffffff8013c565>{do_softirq+53}
<Jan/06 03:50 pm> <ffffffff80113baf>{do_IRQ+335}
<ffffffff80111001>{ret_from_intr+0}
<Jan/06 03:50 pm> <EOI> <ffffffff8031e419>{thread_return+41}
<ffffffff8010eb20>{default_idle+0}
<Jan/06 03:50 pm> <ffffffff8010eb44>{default_idle+36}
<ffffffff8010ebdc>{cpu_idle+44}
<Jan/06 03:50 pm> <ffffffff80517885>{start_kernel+453}
<Jan/06 03:50 pm>swapper: page allocation failure. order:0, mode:0x20

<Jan/06 03:50 pm>Call Trace:<IRQ> <ffffffff80158fa0>{__alloc_pages+816}
<ffffffff80158fe0>{__get_free_pages+16}
<Jan/06 03:50 pm> <ffffffff8015c886>{kmem_getpages+38}
<ffffffff8015d8be>{cache_grow+190}
<Jan/06 03:50 pm> <ffffffff8015db16>{cache_alloc_refill+422}
<ffffffff8015de06>{kmem_cache_alloc+54}
<Jan/06 03:50 pm> <ffffffff802d5eaf>{dst_alloc+47}
<ffffffff802e3d17>{ip_route_input_slow+1639}
<Jan/06 03:50 pm> <ffffffff802e612e>{ip_rcv+526}
<ffffffff802d297d>{netif_receive_skb+477}

This was just a partial listing from one of our servers. I had read in
several lists that this was not considered fatal. The problem is that
with our setup it has turned fatal, to the point of locking us out of
the system remotely, so that only a reset at the machine itself worked
(it didn't even honor the sysrq-b combo at the console).

Has anyone else run into this? I can get this kind of error using about
20 clients (100 MB connected) hitting one server (dual gigabit bonded).
With 2.6.8.1 and newer, the errors are reproducible, but I can't exactly
tell when they happen (either write or read). I think I have seen them
happen in both writes and reads. And the kswapd problems happened
during writes and reads both as well.

I can also get the kswapd going crazy with a local set of disk I/O
tests.

Any information needed, please ask. Any help would be appreciated.

Thanks,
Norman Weathers




-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Piggin
Sent: Thursday, February 03, 2005 7:20 PM
To: Andrew Morton
Cc: [email protected]; [email protected]; [email protected]
Subject: Re: 2.6.10: kswapd spins like crazy



Andrew Morton wrote:
> Nick Piggin <[email protected]> wrote:
>
>>Oh, attached should be a minimal fix if you would like to try it out.
>>
>>
>>...
>>--- linux-2.6/mm/vmscan.c~vmscan-minfix 2005-02-04 11:52:37.000000000 +1100
>>+++ linux-2.6-npiggin/mm/vmscan.c 2005-02-04 11:53:32.000000000 +1100
>>@@ -575,6 +575,7 @@ static void shrink_cache(struct zone *zo
>> nr_taken++;
>> }
>> zone->nr_inactive -= nr_taken;
>>+ zone->pages_scanned += nr_scan;
>> spin_unlock_irq(&zone->lru_lock);
>>
>> if (nr_taken == 0)
>>
>
>
> Any theories as to why these pages aren't being activated and aren't being
> reclaimed?
>
>

No none yet, which is what we should get to the bottom of. I must be
overlooking something, but the only ways I can see should be due to
transient conditions like page locked or under writeback. laptop_mode?

Terje, what is /proc/sys/vm/laptop_mode set to?

2005-02-04 17:39:29

by Terje Fåberg

Subject: Re: 2.6.10: kswapd spins like crazy

Terje Fåberg <[email protected]> wrote:

> I'll continue to do the same things I did yesterday
> before kswapd started to spin.

Looks very good so far. I am unable to reproduce the
bad kswapd behaviour with your patch, Nick.

To double-check I booted into the old kernel an hour
ago and I _could_ reproduce the bad behaviour within a
few minutes.

Looks like your patch fixes it for my workload.

Thanks a lot,
Terje

2005-02-04 23:19:33

by Nick Piggin

Subject: Re: 2.6.10: kswapd spins like crazy

Terje Fåberg wrote:
> Terje Fåberg <[email protected]> wrote:
>
>
>>I'll continue to do the same things I did yesterday
>>before kswapd started to spin.
>
>
> Looks very good so far. I am unable to reproduce the
> bad kswapd behaviour with your patch, Nick.
>
> To double-check I booted into the old kernel an hour
> ago and I _could_ reproduce the bad behaviour within a
> few minutes.
>
> Looks like your patch fixes it for my workload.
>

OK that's good to know. At this stage it is only working
around the intermediate symptoms, and we might want a
different fix for 2.6.11...

So hopefully you'll be able to test a patch or two if
you get time.

Thanks,
Nick

2005-02-05 07:12:58

by Terje Fåberg

Subject: Re: 2.6.10: kswapd spins like crazy

Nick Piggin <[email protected]> wrote:

> OK that's good to know. At this stage it is only
> working around the intermediate symptoms, and we
> might want a different fix for 2.6.11...
>
> So hopefully you'll be able to test a patch or two
> if you get time.

Sure. Just drop me a mail.
I'm glad if I can help.

Regards,
Terje