2002-07-19 20:30:47

by Johannes Erdfelt

[permalink] [raw]
Subject: 2.4.19rc2aa1 VM too aggressive?

I recently upgraded a web server I run to the 2.4.19rc2aa1 kernel to
see how much better the VM is.

It seems to be better than the older 2.4 kernels used on this machine,
but there seems to be lots of motion in the cache for all of the free
memory that exists:

procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
8 0 1 106764 460152 10688 102432 0 0 76 0 971 620 55 44 1
12 0 3 106764 321564 10712 237828 0 0 16 296 1286 860 34 65 0
2 1 0 106760 529488 10712 42284 4 0 4 0 707 481 28 72 0
7 0 0 106760 529496 10736 42468 0 0 204 0 1105 730 28 17 56
5 0 0 106760 515220 10740 53544 0 0 152 0 1237 929 50 22 29
5 0 0 106760 525364 10740 42680 0 0 32 0 918 611 50 29 21
2 0 1 106756 527248 10772 42672 0 0 4 308 994 692 52 32 17
10 1 1 106744 486112 10776 78788 192 0 300 0 1127 638 75 24 2
3 0 1 106744 517516 10776 49696 0 0 4 0 1005 623 55 45 0
4 0 0 106128 528812 10780 42992 0 0 192 0 644 367 13 22 65
3 0 0 106128 527444 10804 43012 0 0 12 276 561 386 16 14 70
1 0 1 106108 527540 10804 43412 8 0 8 0 1224 794 40 47 13
4 0 0 106108 510192 10804 59356 0 0 4 0 481 322 21 17 61
3 0 0 106076 527712 10812 43476 0 0 64 0 1333 968 46 38 16
3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14 77
16 0 1 106024 452508 10876 111588 0 0 20 308 986 621 49 38 13
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
26 0 1 106024 326064 10884 226384 0 0 324 0 1347 841 52 48 0
21 0 1 106016 241296 10896 308532 4 0 56 0 1154 737 53 46 1
15 6 1 137612 51136 8432 480368 40 1700 412 2044 3657 2514 31 69 0
12 22 0 115336 490416 8216 85448 0 1256 52 1320 381 7918 8 92 0
23 0 1 115260 435232 8252 133128 52 24 1168 292 2158 3848 43 47 10
11 0 1 115228 451688 8252 120776 0 0 76 0 1261 775 46 54 0
9 0 1 115180 454820 8296 118996 36 0 1532 0 1844 1530 52 48 0
11 0 1 115176 445800 8340 124312 128 0 656 412 892 796 48 52 0
9 0 1 114916 463744 8352 111960 12 0 452 0 708 673 23 77 0
6 0 1 114888 465108 8356 110332 24 0 248 0 745 696 33 67 0
12 0 1 114876 513064 8356 63196 0 0 420 0 1113 825 41 59 0
2 1 0 114876 550504 8368 28976 0 0 772 0 1614 1066 51 48 1
2 0 0 114868 558216 8408 21820 0 0 220 432 1453 1269 49 27 24
1 0 0 114868 566768 8412 14456 0 0 288 0 909 674 41 17 43
1 0 0 114864 565844 8428 14640 0 0 108 0 920 744 39 13 49

This is with a 1 second interval. Why is it that most of the time I have
~400MB of memory free? (This machine has 1GB of memory.) Why does the
cache size vary so wildly?
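One way to quantify swings like these from a `vmstat 1` capture is to diff the cache column between samples. This is only an illustrative sketch (the helper name is mine, and the field position assumes the 2.4-era vmstat layout shown above, where cache is column 7):

```shell
# Print the per-interval change in the cache column of `vmstat 1` output.
# Skips the two header lines and any repeated headers (non-numeric $7).
cache_deltas() {
    awk 'NR > 2 && $7 ~ /^[0-9]+$/ { if (prev != "") print $7 - prev; prev = $7 }'
}

# Demo on a captured sample rather than a live system:
cache_deltas <<'EOF'
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
8 0 1 106764 460152 10688 102432 0 0 76 0 971 620 55 44 1
12 0 3 106764 321564 10712 237828 0 0 16 296 1286 860 34 65 0
2 1 0 106760 529488 10712 42284 4 0 4 0 707 481 28 72 0
EOF
```

On the sample above this prints 135396 then -195544, i.e. a ~130MB jump followed by a ~190MB drop within two seconds, which is the kind of swing being asked about.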

This machine is busy, as you can see, but it looks like the VM is trying
to be a bit too aggressive here.

Any ideas?

JE


2002-07-19 20:49:29

by David Rees

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, Jul 19, 2002 at 04:33:50PM -0400, Johannes Erdfelt wrote:
> I recently upgraded a web server I run to a the 2.4.19rc2aa1 kernel to
> see how much better the VM is.
>
> It seems to be better than the older 2.4 kernels used on this machine,
> but there seems to be lots of motion in the cache for all of the free
> memory that exists:
>
> procs memory swap io system cpu
> 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
> 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
> 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14 77
>
> This is with a 1 second interval. Why is it that most of the time I have
> ~400MB of memory free (this machine has 1GB of memory). Why does the
> cache size vary so wildly?
>
> This machine is busy, as you can see, but it looks like the VM is trying
> to be a bit too aggressive here.

What type of workload? This looks fairly typical of a workload which
writes/deletes large files.

-Dave

2002-07-19 21:00:56

by Johannes Erdfelt

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, Jul 19, 2002, David Rees <[email protected]> wrote:
> On Fri, Jul 19, 2002 at 04:33:50PM -0400, Johannes Erdfelt wrote:
> > I recently upgraded a web server I run to a the 2.4.19rc2aa1 kernel to
> > see how much better the VM is.
> >
> > It seems to be better than the older 2.4 kernels used on this machine,
> > but there seems to be lots of motion in the cache for all of the free
> > memory that exists:
> >
> > procs memory swap io system cpu
> > 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
> > 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
> > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14 77
> >
> > This is with a 1 second interval. Why is it that most of the time I have
> > ~400MB of memory free (this machine has 1GB of memory). Why does the
> > cache size vary so wildly?
> >
> > This machine is busy, as you can see, but it looks like the VM is trying
> > to be a bit too aggressive here.
>
> What type of workload? This looks fairly typicaly of a workload which
> writes/deletes large files.

Web server. The only writing is for the log files, which is relatively
minimal.

You can see from the io, that writes are relatively infrequent, while
reads happen regularly to fetch various documents from disk.

One other thing: there is lots of process creation in this example.
For a variety of reasons, PHP programs are often forked from the Apache
server.

The systems running an older kernel (like RedHat's 2.4.9-21) are much
more consistent in their memory usage. There are no 150MB swings in
cache utilization, etc.

What's really odd in the vmstat output is the fact that there is no disk
I/O that follows these wild swings. Where is this cache memory coming
from? Or is the accounting just wrong?

JE

2002-07-19 21:42:20

by Johannes Erdfelt

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, Jul 19, 2002, Mark Hahn <[email protected]> wrote:
> > > > procs memory swap io system cpu
> > > > r b w swpd free buff cache si so bi bo in cs us sy id
> > > > 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
> > > > 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
> > > > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > > > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> > > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14 77
> ..
> > What's really odd in the vmstat output is the fact that there is no disk
> > I/O that follows these wild swings. Where is this cache memory coming
> > from? Or is the accounting just wrong?
>
> you're right, the jump up makes no sense. if fork was increasing cached-page
> counters even for cloned pages, that might explain it (and be a bug).

procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
12 0 1 151664 365212 11216 201528 12 0 96 0 747 557 37 62 1
7 0 1 151540 425468 11216 146308 0 0 0 0 904 620 45 55 0
0 0 0 151540 540884 11216 37828 0 0 8 0 593 376 12 32 57
2 0 0 151528 533160 11240 44264 0 0 0 284 511 379 14 20 66
0 0 0 151496 540380 11240 37860 8 0 36 0 555 406 16 11 73
0 0 0 151496 540296 11240 37928 0 0 60 0 438 341 19 45 36
0 0 0 151464 540124 11244 37996 0 0 64 0 408 296 9 2 89
3 0 0 151456 503868 11252 71840 0 0 52 0 630 434 29 32 39
15 0 1 151344 416060 11284 151764 8 0 32 296 854 568 50 47 2
19 0 1 151296 335576 11284 226012 0 0 0 0 830 584 49 51 0
20 0 1 151208 286524 11284 268620 0 0 0 0 980 593 60 40 0
10 0 1 150652 451832 11324 119612 16 0 268 272 4815 3162 39 61 0
13 0 4 149660 475196 11348 93836 28 0 68 292 1178 889 51 39 10
15 0 1 149252 105568 11412 447892 116 0 648 284 5491 3849 40 60 0
6 0 0 149252 536052 11424 39132 0 0 56 0 700 527 15 80 5
5 0 1 149072 487304 11436 84188 8 0 108 0 966 648 47 52 1
3 0 0 148984 485116 11440 87760 32 0 100 0 749 512 39 61 0
0 0 0 148932 536436 11468 39324 0 0 24 304 593 385 19 13 68

It's constantly happening too. Like every couple of minutes.

Andrea, any idea what the cause of these fluctuations is?

JE

2002-07-19 22:29:29

by Austin Gonyou

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, 2002-07-19 at 16:03, Johannes Erdfelt wrote:
> On Fri, Jul 19, 2002, David Rees <[email protected]> wrote:
> > On Fri, Jul 19, 2002 at 04:33:50PM -0400, Johannes Erdfelt wrote:
> > > I recently upgraded a web server I run to a the 2.4.19rc2aa1 kernel to
> > > see how much better the VM is.
> > >
> > > It seems to be better than the older 2.4 kernels used on this machine,
> > > but there seems to be lots of motion in the cache for all of the free
> > > memory that exists:
> > >
> > > procs memory swap io system cpu
> > > 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
> > > 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
> > > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14 77
...

> Web server. The only writing is for the log files, which is relatively
> minimal.

But if I understand correctly, you are using prefork, and not a threaded model, correct?

>
> One thing also, is there is lots of process creation in this example.
> For a variety of reasons, PHP programs are forked often from the Apache
> server.

Also, even running PHP as a DSO (which I think you may not be doing;
CGI vs. DSO), you will use a bit of memory on top of Apache every
time a new child is created by Apache to handle incoming requests.

> The systems running an older kernel (like RedHat's 2.4.9-21) are much
> more consistent in their usage of memory. There are no 150MB swings in
> cache utiliziation, etc.

Hrrmmm....I'd suggest a 2.4.17 or 2.4.19-rc1-aa2 in that case. I promise
you'll see drastic improvements over that kernel.

> What's really odd in the vmstat output is the fact that there is no disk
> I/O that follows these wild swings. Where is this cache memory coming
> from? Or is the accounting just wrong?

I think the accounting is quite correct. Let's look real quick.

<vmstat>
> > > procs memory swap io system cpu
> > > 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
> > > 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
> > > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14
</vmstat>

Now let's take a closer look....

<vmstat2>
> > > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
</vmstat2>


Notice your memory utilization jumps here as free memory is given to
cache.

<vmstat3>
> > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14
</vmstat3>

And then back again, probably on process termination.

At that rate, it's all in-memory shuffling, and for prefork servers
that is very likely the case.

> JE
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Austin Gonyou <[email protected]>

2002-07-19 23:01:50

by Johannes Erdfelt

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, Jul 19, 2002, Austin Gonyou <[email protected]> wrote:
> On Fri, 2002-07-19 at 16:03, Johannes Erdfelt wrote:
> > Web server. The only writing is for the log files, which is relatively
> > minimal.
>
> But IMHO, you are using prefork, and not a threaded model correct?

Yes, it's a prefork.

> > One thing also, is there is lots of process creation in this example.
> > For a variety of reasons, PHP programs are forked often from the Apache
> > server.
>
> Also, here, even as a DSO, which I think you may not be running PHP as,
> (cgi vs. dso), you will use a bit of memory, on top of apache, every
> time the new child is created by apache to handle incoming requests.

I use both, but for legacy reasons there's still a significant number
of children being forked for the CGI-like version (caused by SSI).

The memory size for these children is about 40MB (which is strange in
itself), and a couple per second get executed. However, they are very
quick and you typically won't see any in ps, though occasionally 1 or 2
will be seen.

> > The systems running an older kernel (like RedHat's 2.4.9-21) are much
> > more consistent in their usage of memory. There are no 150MB swings in
> > cache utiliziation, etc.
>
> Hrrmmm....I'd suggest a 2.4.17 or 2.4.19-rc1-aa2 in that case. I promise
> you'll see drastic improvements over that kernel.

2.4.17 wasn't good last time I tried it, but I've had much better results
from Andrea's patches. I'll build a 2.4.19-rc1-aa2 kernel and see how
that fares.

> > What's really odd in the vmstat output is the fact that there is no disk
> > I/O that follows these wild swings. Where is this cache memory coming
> > from? Or is the accounting just wrong?
>
> I think the accounting is quite correct. Let's look real quick.

I suspect it's correct as well, but that doesn't mean something else
isn't wrong :)

> <vmstat>
> > > > procs memory swap io system cpu
> > > > 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
> > > > 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
> > > > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > > > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> > > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14
> </vmstat>
>
> Now let's take a closer look....
>
> <vmstat2>
> > > > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > > > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> </vmstat2>
>
> Notice you're memory utilization jumps here as your free is given to
> cache.

Are you saying that the cache value is the amount of memory available to
be used by the cache, or actually used by the cache?

It was my understanding that it's the memory actually used by the cache.
If that's the case, I don't understand where the data to fill the cache
is coming from with these blips.
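If it really is memory in use by the cache, one sanity check is that free + buff + cache should stay roughly constant across a swing, since pages would only be moving between lists rather than appearing from nowhere. A hedged sketch (the helper name is mine; the columns assume the vmstat layout quoted above):

```shell
# Sum free ($5) + buff ($6) + cache ($7) for each vmstat sample.
# If the accounting is right, the total should barely move across a swing.
sum_mem() {
    awk '$7 ~ /^[0-9]+$/ { print $5 + $6 + $7 }'
}

sum_mem <<'EOF'
16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
EOF
```

On these two samples the totals come out as 570124 and 582764 — within about 12MB of each other, which is at least consistent with in-memory shuffling rather than broken accounting.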

> <vmstat3>
> > > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14
> </vmstat3>
>
> And then back again, probably on process termination.

There are a couple of those processes per second, so I would expect this
to happen all of the time, or at least much more often.

> At that rate, it's all in-memory shuffling going on, and for preforks,
> that very likely is the case.

One thing to note is the significant amount of system time spent
during these situations as well. It looks like a lot of time is spent
managing something.

It's obvious the workload is inefficient, but it's constantly
inefficient, which is why these blips are strange.

JE

2002-07-19 23:24:07

by jjs

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

I've seen several mentions of 2.4.19-rc1aa2 -

FYI, 2.4.19-rc2aa1 came out a few days ago -

I've been running it and it seems to perform
even better under pressure than 2.4.19-rc1aa2,
at least in my workloads...

Joe

Johannes Erdfelt wrote:

> On Fri, Jul 19, 2002, Austin Gonyou <[email protected]> wrote:
>> On Fri, 2002-07-19 at 16:03, Johannes Erdfelt wrote:
>>> Web server. The only writing is for the log files, which is relatively
>>> minimal.
>>
>> But IMHO, you are using prefork, and not a threaded model correct?
>
> Yes, it's a prefork.
>
>>> One thing also, is there is lots of process creation in this example.
>>> For a variety of reasons, PHP programs are forked often from the Apache
>>> server.
>>
>> Also, here, even as a DSO, which I think you may not be running PHP as,
>> (cgi vs. dso), you will use a bit of memory, on top of apache, every
>> time the new child is created by apache to handle incoming requests.
>
> Use both, but for legacy reasons there's still a signficant amount of
> children being forked for the CGI like version (caused by SSI).
>
> The memory size for these children is about 40MB (which is strange in
> itself), and a couple per second get executed. However, they are very
> quick and typically won't see any in ps, but occassionally 1 or 2 will
> be seen.
>
>>> The systems running an older kernel (like RedHat's 2.4.9-21) are much
>>> more consistent in their usage of memory. There are no 150MB swings in
>>> cache utiliziation, etc.
>>
>> Hrrmmm....I'd suggest a 2.4.17 or 2.4.19-rc1-aa2 in that case. I promise
>> you'll see drastic improvements over that kernel.
>
> 2.4.17 wasn't good last time I tried it, but I've have much better results
> from Andrea's patches. I'll create 2.4.19-rc1-aa2 kernel and see how
> that fares.
>
>>> What's really odd in the vmstat output is the fact that there is no disk
>>> I/O that follows these wild swings. Where is this cache memory coming
>>> from? Or is the accounting just wrong?
>>
>> I think the accounting is quite correct. Let's look real quick.
>
> I suspect it's correct as well, but that doesn't mean something else
> isn't wrong :)
>
>> <vmstat>
>>>>> procs memory swap io system cpu
>>>>> 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
>>>>> 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
>>>>> 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
>>>>> 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
>>>>> 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
>>>>> 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14
>> </vmstat>
>>
>> Now let's take a closer look....
>>
>> <vmstat2>
>>>>> 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
>>>>> 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
>> </vmstat2>
>>
>> Notice you're memory utilization jumps here as your free is given to
>> cache.
>
> Are you saying that the cache value is the amount of memory available to
> be used by the cache, or actually used by the cache?
>
> It was my understanding that it's the memory actually used by the cache.
> If that's the case, I don't understand where the data to fill the cache
> is coming from with these blips.
>
>> <vmstat3>
>>>>> 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
>>>>> 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14
>> </vmstat3>
>>
>> And then back again, probably on process termination.
>
> There are couple per second of those processes, so I would expect this
> to happen all of the time or atleast much more often.
>
>> At that rate, it's all in-memory shuffling going on, and for preforks,
>> that very likely is the case.
>
> One thing to note as well is a significant amount of system time spent
> during these situations as well. It looks like a lot of time is spent
> managing something.
>
> It's obvious the workload is inefficient, but it's constantly
> inefficient which is why these blips are strange.
>
> JE

2002-07-20 00:04:26

by Rik van Riel

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On 19 Jul 2002, Austin Gonyou wrote:

> Notice you're memory utilization jumps here as your free is given to
> cache.

Swinging back and forth 150 MB per second seems a bit excessive
for that, especially considering that the previously cached
memory seems to end up on the free list and the fact that there
is between 350 and 500 MB free memory.

regards,

Rik
--
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/ http://distro.conectiva.com/


2002-07-20 02:09:13

by Austin Gonyou

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, 2002-07-19 at 18:04, Johannes Erdfelt wrote:
> On Fri, Jul 19, 2002, Austin Gonyou <[email protected]> wrote:
> > On Fri, 2002-07-19 at 16:03, Johannes Erdfelt wrote:
> > > Web server. The only writing is for the log files, which is relatively
> > > minimal.
> >
> > But IMHO, you are using prefork, and not a threaded model correct?
>
> Yes, it's a prefork.
OK.

> > Also, here, even as a DSO, which I think you may not be running PHP as,
> > (cgi vs. dso), you will use a bit of memory, on top of apache, every
> > time the new child is created by apache to handle incoming requests.
>
> Use both, but for legacy reasons there's still a signficant amount of
> children being forked for the CGI like version (caused by SSI).

Right, I understand this fully.


> The memory size for these children is about 40MB (which is strange in
> itself), and a couple per second get executed. However, they are very
> quick and typically won't see any in ps, but occassionally 1 or 2 will
> be seen.

It "is" a little odd, but that's correct for apache. We see about the
same here.

> > > The systems running an older kernel (like RedHat's 2.4.9-21) are much
> > > more consistent in their usage of memory. There are no 150MB swings in
> > > cache utiliziation, etc.
> >
> > Hrrmmm....I'd suggest a 2.4.17 or 2.4.19-rc1-aa2 in that case. I promise
> > you'll see drastic improvements over that kernel.
>
> 2.4.17 wasn't good last time I tried it, but I've have much better results
> from Andrea's patches. I'll create 2.4.19-rc1-aa2 kernel and see how
> that fares.

The only reason I suggested that was because the 2.4.17+aa patches have
been good to me. I'm just stuck on aa, I guess...but for most
applications I've run, and all our hardware, Dell and otherwise, it
works the best so far.

> > > What's really odd in the vmstat output is the fact that there is no disk
...
> > > from? Or is the accounting just wrong?
> >
> > I think the accounting is quite correct. Let's look real quick.
>
> I suspect it's correct as well, but that doesn't mean something else
> isn't wrong :)
>
Right....it very well could be.
...
> >
> > Notice you're memory utilization jumps here as your free is given to
> > cache.
>
> Are you saying that the cache value is the amount of memory available to
> be used by the cache, or actually used by the cache?
>
> It was my understanding that it's the memory actually used by the cache.
> If that's the case, I don't understand where the data to fill the cache
> is coming from with these blips.

I think what's happening here (possibly a vmstat artifact, I'm not
sure) is that the memory is allocated as part of the cost of starting
the new processes; while a process is "running", that memory is counted
as cache, and it is then freed once the process goes away because
Apache deallocates it. (I'm guessing, but that seems to be the order of
events for what I know of the Apache httpd spawn process.)


> > <vmstat3>
> > > > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14
> > </vmstat3>
> >
> > And then back again, probably on process termination.
>
> There are couple per second of those processes, so I would expect this
> to happen all of the time or atleast much more often.
>
> > At that rate, it's all in-memory shuffling going on, and for preforks,
> > that very likely is the case.
>
> One thing to note as well is a significant amount of system time spent
> during these situations as well. It looks like a lot of time is spent
> managing something.
>
> It's obvious the workload is inefficient, but it's constantly
> inefficient which is why these blips are strange.
>

Ahh..I know what you're referring to. If you look you'll see that it's
in system time. Hrrmm..we see that here too, but only in specific
network topologies. Ours, I think, we can trace to the logs we write,
but some systems do it and others don't. Our production systems, which
run the same kernels as our test boxen, never see the behaviour you're
seeing. That is odd.

Well, one thing I can offer is that during our periods of high system
usage we usually have a lot of processes that haven't flushed their
logs to disk yet, and there's lots of memory in cache during that time
until everything gets cleared out. Maybe this is something similar?
Just curious.

> JE
--
Austin Gonyou <[email protected]>

2002-07-23 19:44:32

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, Jul 19, 2002 at 05:45:21PM -0400, Johannes Erdfelt wrote:
> On Fri, Jul 19, 2002, Mark Hahn <[email protected]> wrote:
> > > > > procs memory swap io system cpu
> > > > > r b w swpd free buff cache si so bi bo in cs us sy id
> > > > > 3 0 0 106036 502288 10812 67236 0 0 0 0 802 494 46 37 17
> > > > > 5 0 2 106032 476188 10844 91496 0 0 4 316 905 573 54 37 8
> > > > > 16 0 2 106032 355400 10844 203880 0 0 4 0 909 540 51 49 0
> > > > > 10 0 2 106024 340108 10852 221548 0 0 28 0 975 659 36 64 0
> > > > > 0 0 0 106024 528340 10852 43572 0 0 4 0 569 426 17 17 67
> > > > > 0 1 0 106024 531304 10852 43612 0 0 4 0 542 342 9 14 77
> > ..
> > > What's really odd in the vmstat output is the fact that there is no disk
> > > I/O that follows these wild swings. Where is this cache memory coming
> > > from? Or is the accounting just wrong?
> >
> > you're right, the jump up makes no sense. if fork was increasing cached-page
> > counters even for cloned pages, that might explain it (and be a bug).
>
> procs memory swap io system cpu
> r b w swpd free buff cache si so bi bo in cs us sy id
> 12 0 1 151664 365212 11216 201528 12 0 96 0 747 557 37 62 1
> 7 0 1 151540 425468 11216 146308 0 0 0 0 904 620 45 55 0
> 0 0 0 151540 540884 11216 37828 0 0 8 0 593 376 12 32 57
> 2 0 0 151528 533160 11240 44264 0 0 0 284 511 379 14 20 66
> 0 0 0 151496 540380 11240 37860 8 0 36 0 555 406 16 11 73
> 0 0 0 151496 540296 11240 37928 0 0 60 0 438 341 19 45 36
> 0 0 0 151464 540124 11244 37996 0 0 64 0 408 296 9 2 89
> 3 0 0 151456 503868 11252 71840 0 0 52 0 630 434 29 32 39
> 15 0 1 151344 416060 11284 151764 8 0 32 296 854 568 50 47 2
> 19 0 1 151296 335576 11284 226012 0 0 0 0 830 584 49 51 0
> 20 0 1 151208 286524 11284 268620 0 0 0 0 980 593 60 40 0
> 10 0 1 150652 451832 11324 119612 16 0 268 272 4815 3162 39 61 0
> 13 0 4 149660 475196 11348 93836 28 0 68 292 1178 889 51 39 10
> 15 0 1 149252 105568 11412 447892 116 0 648 284 5491 3849 40 60 0
> 6 0 0 149252 536052 11424 39132 0 0 56 0 700 527 15 80 5
> 5 0 1 149072 487304 11436 84188 8 0 108 0 966 648 47 52 1
> 3 0 0 148984 485116 11440 87760 32 0 100 0 749 512 39 61 0
> 0 0 0 148932 536436 11468 39324 0 0 24 304 593 385 19 13 68
>
> It's constantly happening too. Like every couple of minutes.
>
> Andrea, any idea what the cause of these fluctuations are?

Shared memory? It looks all right. Please try to monitor the shm usage
as root with ipcs and ls -l /dev/shm.

Andrea

2002-07-23 20:20:12

by Stephen Hemminger

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

When running sequential write performance tests on 2.4.19rc3ac3 I see a
similar problem. What happens is the page cache gets really big, then
the machine starts swapping and becomes unusable.

It seems like there needs to be some upper bound on the page cache or
flow control on file writes to not allow the cpu to get ahead of the
disk.

This is vmstat output when running iozone; it does a .5G file first,
then a 1G file. The machine has lots of memory, but when it fills, it
goes off the deep end...

procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
1 0 1 0 234456 12128 542360 0 0 0 32475 1044 29 0 11 89
1 0 1 0 233744 12160 542360 0 0 0 62712 1065 79 0 18 82
0 1 0 0 303008 12160 542360 0 0 0 21751 1042 21 0 3 97
1 0 1 0 510140 12384 201112 0 0 0 11070 1042 18 0 14 86
0 1 0 0 302280 12660 542360 0 0 0 77826 1034 54 0 18 82
0 1 0 0 302280 12660 542360 0 0 0 0 1040 8 0 1 99
1 0 1 0 176144 12660 542360 0 0 0 39073 1043 23 0 12 88
0 1 0 0 302272 12660 542360 0 0 0 81925 1050 60 0 20 80
0 1 0 0 302268 12660 542360 0 0 0 0 1044 6 0 0 100
1 0 1 0 480496 12852 176408 0 0 0 10661 1043 18 0 14 86
0 1 0 0 301456 13156 542360 0 0 0 81511 1035 37 0 19 81
0 1 0 0 301452 13156 542360 0 0 0 0 1040 7 0 0 100
1 0 1 0 181724 13156 542360 0 0 0 30479 1047 18 0 10 90
0 1 0 0 301408 11864 543684 0 0 1 87331 1077 61 0 23 77
0 1 0 0 301400 11872 543684 0 0 0 2 1043 8 0 0 100
0 0 1 0 529624 11880 313604 0 0 0 26639 1043 1843 0 9 91
0 0 2 0 137892 11880 700356 0 0 0 92110 1036 2424 0 10 90
0 0 1 0 54688 11888 780868 0 0 0 29978 1038 515 0 2 98
0 1 1 7900 7024 976 848600 747 1926 1277 22411 1242 730 0 8 92
0 1 1 7900 7036 596 849352 1412 305 2351 738 1421 447 0 4 96
0 2 1 7900 7040 596 849488 1453 162 3077 174 1799 821 0 14 86
0 2 1 7900 7032 596 849772 888 150 1634 168 4111 1165 0 39 61

2002-07-23 20:29:36

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Tue, Jul 23, 2002 at 01:22:35PM -0700, Stephen Hemminger wrote:
> When running sequential write performance tests on 2.4.19rc3ac3 I see a
> similar problem. What happens is the page cache gets really big, then
> the machine starts swapping and becomes unusable.
>
> It seems like there needs to be some upper bound on the page cache or
> flow control on file writes to not allow the cpu to get ahead of the
> disk.

that's the write throttling.

>
> This is vmstat output when running iozone, and it does first .5 G file
> then a 1G file. The machine has lots of memory but when it fills, it
> goes off the deep end...
>
> procs memory swap io system cpu
> r b w swpd free buff cache si so bi bo in cs us sy id
> 1 0 1 0 234456 12128 542360 0 0 0 32475 1044 29 0 11 89
> 1 0 1 0 233744 12160 542360 0 0 0 62712 1065 79 0 18 82
> 0 1 0 0 303008 12160 542360 0 0 0 21751 1042 21 0 3 97
> 1 0 1 0 510140 12384 201112 0 0 0 11070 1042 18 0 14 86
> 0 1 0 0 302280 12660 542360 0 0 0 77826 1034 54 0 18 82
> 0 1 0 0 302280 12660 542360 0 0 0 0 1040 8 0 1 99
> 1 0 1 0 176144 12660 542360 0 0 0 39073 1043 23 0 12 88
> 0 1 0 0 302272 12660 542360 0 0 0 81925 1050 60 0 20 80
> 0 1 0 0 302268 12660 542360 0 0 0 0 1044 6 0 0 100
> 1 0 1 0 480496 12852 176408 0 0 0 10661 1043 18 0 14 86
> 0 1 0 0 301456 13156 542360 0 0 0 81511 1035 37 0 19 81
> 0 1 0 0 301452 13156 542360 0 0 0 0 1040 7 0 0 100
> 1 0 1 0 181724 13156 542360 0 0 0 30479 1047 18 0 10 90
> 0 1 0 0 301408 11864 543684 0 0 1 87331 1077 61 0 23 77
> 0 1 0 0 301400 11872 543684 0 0 0 2 1043 8 0 0 100
> 0 0 1 0 529624 11880 313604 0 0 0 26639 1043 1843 0 9 91
> 0 0 2 0 137892 11880 700356 0 0 0 92110 1036 2424 0 10 90
> 0 0 1 0 54688 11888 780868 0 0 0 29978 1038 515 0 2 98
> 0 1 1 7900 7024 976 848600 747 1926 1277 22411 1242 730 0 8 92
> 0 1 1 7900 7036 596 849352 1412 305 2351 738 1421 447 0 4 96
> 0 2 1 7900 7040 596 849488 1453 162 3077 174 1799 821 0 14 86
> 0 2 1 7900 7032 596 849772 888 150 1634 168 4111 1165 0 39 61

Some occasional swapout is OK; the strange thing is those small
swapins/swapouts. I also assume it's writing using write(2), not with
map_shared+msync.

can you try:

echo 1000 >/proc/sys/vm/vm_mapped_ratio

I also wonder if you have quite some amount of mapped address space during
the benchmark. In that case there's no trivial way around it: the vm
will constantly find tons of mapped address space and trigger some
swapouts, but the swapins shouldn't happen so fast in that case.

In any case the sysctl will allow you to tune for your workload.

Andrea

2002-07-23 21:31:55

by Stephen Hemminger

Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Tue, 2002-07-23 at 13:33, Andrea Arcangeli wrote:

> some occasional swapout is ok; the strange thing is those small
> swapins/swapouts. I also assume it's writing using write(2), not with
> map_shared+msync.

I am using Ben LaHaise's new AIO, which effectively maps the pages in
before the i/o. Using normal I/O I don't see swapping; the cache peaks
at about 827028.

>
> can you try:
>
> echo 1000 >/proc/sys/vm/vm_mapped_ratio

That file does not exist in 2.4.19rc3ac3
bash-2.05$ ls /proc/sys/vm
bdflush max_map_count min-readahead page-cluster
kswapd max-readahead overcommit_memory pagetable_cache
>
> I also wonder if you have quite some amount of mapped address space during
> the benchmark. In that case there's no trivial way around it: the vm
> will constantly find tons of mapped address space and trigger some
> swapouts, but the swapins shouldn't happen so fast in that case.
The AIO will pin some space, but the upper bound should be
NIO(16) * Record Size(64k) = 1 Meg


> In any case the sysctl will allow you to tune for your workload.
>
> Andrea


2002-07-23 22:37:24

by Andrea Arcangeli

Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Tue, Jul 23, 2002 at 02:34:46PM -0700, Stephen Hemminger wrote:
> On Tue, 2002-07-23 at 13:33, Andrea Arcangeli wrote:
>
> > some occasional swapout is ok; the strange thing is those small
> > swapins/swapouts. I also assume it's writing using write(2), not with
> > map_shared+msync.
>
> I am using Ben LaHaise's new AIO, which effectively maps the pages in
> before the i/o. Using normal I/O I don't see swapping; the cache peaks
> at about 827028.

sorry, I thought you were using 2.4.19rc3aa1. -ac reintroduces a number
of vm bugs with the rmap vm that I fixed some time ago, plus it
underperforms in many areas, and as for async-io, I'm not shipping it.
You should report this to Alan and Ben. I'm only interested in problems
that can be reproduced with mainline and -aa, thanks.

> > can you try:
> >
> > echo 1000 >/proc/sys/vm/vm_mapped_ratio
>
> That file does not exist in 2.4.19rc3ac3

yes I misunderstood the kernel version.

> bash-2.05$ ls /proc/sys/vm
> bdflush max_map_count min-readahead page-cluster
> kswapd max-readahead overcommit_memory pagetable_cache
> >
> > I also wonder if you have quite some amount of mapped address space during
> > the benchmark. In that case there's no trivial way around it: the vm
> > will constantly find tons of mapped address space and trigger some
> > swapouts, but the swapins shouldn't happen so fast in that case.
> The AIO will pin some space, but the upper bound should be
> NIO(16) * Record Size(64k) = 1 Meg
>
>
> > In any case the sysctl will allow you to tune for your workload.
> >
> > Andrea
>


Andrea

2002-07-23 23:49:04

by Andrea Arcangeli

Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Fri, Jul 19, 2002 at 09:07:07PM -0300, Rik van Riel wrote:
> On 19 Jul 2002, Austin Gonyou wrote:
>
> > Notice your memory utilization jumps here as your free is given to
> > cache.
>
> Swinging back and forth 150 MB per second seems a bit excessive
> for that, especially considering that the previously cached
> memory seems to end up on the free list and the fact that there
> is between 350 and 500 MB free memory.

if the app allocates and frees 150MB of shm per second that's what the
kernel has to show you.

Andrea

2002-07-24 00:19:28

by Rik van Riel

Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Wed, 24 Jul 2002, Andrea Arcangeli wrote:
> On Fri, Jul 19, 2002 at 09:07:07PM -0300, Rik van Riel wrote:
> > On 19 Jul 2002, Austin Gonyou wrote:
> >
> > > Notice your memory utilization jumps here as your free is given to
> > > cache.
> >
> > Swinging back and forth 150 MB per second seems a bit excessive
> > for that, especially considering that the previously cached
> > memory seems to end up on the free list and the fact that there
> > is between 350 and 500 MB free memory.
>
> if the app allocates and frees 150MB of shm per second that's what the
> kernel has to show you.

Indeed, though I have to comment that that's rather interesting
web server software ;)

Rik
--
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/ http://distro.conectiva.com/

2002-07-24 04:47:27

by Austin Gonyou

Subject: Re: 2.4.19rc2aa1 VM too aggressive?

On Tue, 2002-07-23 at 19:21, Rik van Riel wrote:
...
> > if the app allocates and frees 150MB of shm per second that's what the
> > kernel has to show you.
>
> Indeed, though I have to comment that that's rather interesting
> web server software ;)

I agree, but I'm not sure what else could be causing it, unless vmstat
is not calculating values correctly, like off-by-one values or something.
I'll take a look at my local box; it's not the same as his test
environment, but still, 512MB and Apache. I'll see what I can see, maybe
we can shed *some* light on this. :)

> Rik

--
Austin Gonyou <[email protected]>