2001-10-23 03:04:12

by Ed Sweetman

Subject: time tells all about kernel VM's

We've all seen benchmarks and "load tests" and "real world runthroughs" of
the Rik and AA kernel VMs. But time does tell all. I've had
2.4.12-ac3-hogstop up and running for over 5 days. The first hiccup I found
was a day or so ago when trying out defragging an ext2 fs on an hdd, just
for the hell of it. I have 770MB of RAM and 128MB of swap (since my other
128MB of swap was on the drive I was defragging and I had swapoff'd it).
First the kernel created about 600MB of buffers in addition to the
application-specified 128MB of buffer I had it using (e2defrag -p 16384).
This brought the system to a crawl. In some twisted reality that may be
considered normal kernel behavior, so I let it pass. Then I created an
insanely large PostScript file and tried loading it in ghostview, magnified
it a couple of times in kghostview, and what happened? I wish I could tell
you, but I can't, because the system immediately went unresponsive and
started swapping at a turtle's pace. I can tell you what didn't happen,
though.

A. OOM did not kick in and kill kghostview. Why, you may ask? Read on to B.
B. The VM has this need to redistribute cache and buffers so that an OOM
situation doesn't take place until basically all of the RAM is in use. The
problem is that currently the VM will swap out stuff it isn't using, and
without buffers it must read from the drive (which is also being used for
swap), which takes more CPU, which isn't there because the app is locking
the kernel up trying to allocate memory (see why dbench causes mp3 skips).
So what happens is that the kernel can't swap because the hdd I/O is being
strangled by the process that's going out of control (kghostview), which
means the VM is stuck doing this redistribution at a snail's pace and the
OOM situation never occurs (or occurs many days later, when you've died of
starvation). That leaves you deadlocked on a kernel with a VM that is
supposed to conquer this situation and make it a thing of the past.

So what happens after a few days of uptime is that we see where the VM has
slight weaknesses that magnify over time and aren't apparent in the normal
run of tests done on each new release to decide whether it's good or not.
Perhaps if I had not had any swap enabled at all, this situation would have
been avoided.
I see this as a pretty serious bug.
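
For anyone who wants to reproduce the buffer growth on its own, a minimal
sketch along these lines should make Buffers: in /proc/meminfo climb the
same way (this assumes e2defrag reaches the disk through the 2.4 buffer
cache; the device path is a placeholder, so point it at a partition you can
afford to hammer):

/* devread.c - rough sketch of the kind of load e2defrag puts on the
 * 2.4 buffer cache: read a block device sequentially and watch Buffers:
 * in /proc/meminfo climb.  The device path is a placeholder. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/hdb1";  /* placeholder */
    static char buf[1024 * 1024];
    long mb = 0;
    int fd = open(dev, O_RDONLY);

    if (fd < 0) {
        perror(dev);
        return 1;
    }
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n <= 0)
            break;
        if (++mb % 100 == 0)
            printf("%ld MB read so far\n", mb);
    }
    close(fd);
    return 0;
}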


2001-10-23 05:02:46

by Ed Sweetman

Subject: Re: time tells all about kernel VM's

On Monday 22 October 2001 23:04, safemode wrote:
> [...]
> I wish I could tell you, but I can't, because the system immediately went
> unresponsive and started swapping at a turtle's pace. I can tell you what
> didn't happen, though.
> [...]
> I see this as a pretty serious bug.

I've reproduced this quite a number of times (unfortunately) by running
graphviz and creating huge (9500x11500) PostScript files that fill the hdd
(perhaps due to a bug) and basically leave no room for anything. This
wreaks havoc on the VM, which has to keep everything in buffers because it
can't write to disk. No error was displayed about running out of disk
space. This seems to be a serious problem for Rik's VM (at least his), and
I would think it would keep it from being chosen as the standard 2.4 VM.
It also seems to show a bug in which running out of disk space is never
reported, and the VM deadlocks the kernel by trying to make room for the
process, which consequently makes the OOM handler useless.
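
The "no error was displayed" part is worth checking separately from the VM
behaviour. Whether ENOSPC is actually delivered on a full filesystem can be
verified with a small test program that writes until the disk fills and
inspects every return value; a rough sketch only, not a reproduction of the
graphviz case (the target path is a placeholder, and it really will fill
that filesystem):

/* enospc_check.c - write to a file until the filesystem fills up and
 * report exactly which call returns an error.  The target path is a
 * placeholder; expect the filesystem it lives on to fill up. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/tmp/fillfile";
    static char block[1024 * 1024];           /* 1MB of zeroes per write */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    long mb = 0;

    if (fd < 0) {
        perror("open");
        return 1;
    }
    for (;;) {
        ssize_t n = write(fd, block, sizeof(block));
        if (n < 0) {
            printf("write failed after %ld MB: %s\n", mb, strerror(errno));
            break;
        }
        if ((size_t)n < sizeof(block)) {
            printf("short write after %ld MB (disk full?)\n", mb);
            break;
        }
        mb++;
    }
    if (fsync(fd) < 0)
        printf("fsync: %s\n", strerror(errno));
    if (close(fd) < 0)
        printf("close: %s\n", strerror(errno));
    unlink(path);
    return 0;
}

If this reports ENOSPC while the graphviz run stays silent, the error is
probably being dropped in the application rather than never being generated
by the kernel.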

2001-10-23 07:41:28

by Helge Hafting

Subject: Re: time tells all about kernel VM's

safemode wrote:
[...]
> B. The VM has this need to redistribute cache and buffers so that an OOM
> situation doesn't take place until basically all of the RAM is in use. The
> problem is that currently the VM will swap out stuff it isn't using, and
> without buffers it must read from the drive (which is also being used for
> swap), which takes more CPU, which isn't there because the app is locking
> the kernel up trying to allocate memory (see why dbench causes mp3 skips).
> So what happens is that the kernel can't swap because the hdd I/O is being
> strangled by the process that's going out of control (kghostview), which
> means the VM is stuck doing this redistribution at a snail's pace and the
> OOM situation never occurs (or occurs many days later, when you've died of
> starvation). That leaves you deadlocked on a kernel with a VM that is
> supposed to conquer this situation and make it a thing of the past.
>
Any VM with paging _can_ be forced into a thrashing situation where
a keypress takes hours to process. A better VM will take more pressure
before it gets there, and performance will degrade more gradually.
But any VM can get into this situation.

Consider a malicious app that uses lots of RAM but deliberately leaves
a _single_ page free. OOM will never happen, but the machine is
brought to its knees anyway. (You can also get into trouble by running
a few hundred infinite loops, with some dummy io so they too get the
io boost other processes get.)
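
A crude sketch of such an app, for illustration only: it reads MemTotal
from /proc/meminfo, allocates a little less than that (the 64MB margin is
an arbitrary choice rather than the literal single page), and keeps
touching the pages so the box thrashes without ever going OOM:

/* hog.c - illustration of "use almost all RAM but never trigger OOM".
 * Reads MemTotal from /proc/meminfo, allocates a bit less than that,
 * then touches every page forever.  The 64MB margin is arbitrary.
 * Don't run this on a machine you care about. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[128];
    size_t mem_kb = 0;

    if (!f) {
        perror("/proc/meminfo");
        return 1;
    }
    while (fgets(line, sizeof(line), f))
        if (sscanf(line, "MemTotal: %zu kB", &mem_kb) == 1)
            break;
    fclose(f);
    if (mem_kb <= 64 * 1024) {
        fprintf(stderr, "could not find a usable MemTotal\n");
        return 1;
    }

    size_t bytes = (mem_kb - 64 * 1024) * 1024;   /* leave ~64MB headroom */
    char *p = malloc(bytes);
    if (!p) {
        perror("malloc");
        return 1;
    }
    printf("touching %zu MB forever...\n", bytes >> 20);
    for (;;)                                      /* keep the pages hot */
        for (size_t i = 0; i < bytes; i += 4096)
            p[i]++;
}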

Swapping out whole processes can help this, but it will merely
move the point where you get stuck. A load control system that
kills processes when response is too slow is possible, but
the problem here is that you can't get people to agree
on how bad is too bad. It is sometimes ok to leave the machine
alone crunching a big problem over the weekend. And sometimes
you _need_ response much faster.

And what app to kill in such a situation?
You had a single memory pig, but it ain't necessarily so.

Helge Hafting

2001-10-23 11:33:42

by Rik van Riel

Subject: Re: time tells all about kernel VM's

On Mon, 22 Oct 2001, safemode wrote:

> A. OOM did not kick in and kill kghostview. Why, you may ask? Read on to B.
> B. .... So what happens is that the kernel can't swap because the hdd I/O
> is being strangled by the process that's going out of control (kghostview),
> which means the VM is stuck doing this redistribution at a snail's pace
> and the OOM situation never occurs

> I see this as a pretty serious bug

Fully agreed. Fixes are welcome.

Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ (volunteers needed)

http://www.surriel.com/ http://distro.conectiva.com/

2001-10-23 19:30:24

by Bill Davidsen

Subject: Re: time tells all about kernel VM's

In article <[email protected]>,
Helge Hafting <[email protected]> wrote:

| Any VM with paging _can_ be forced into a thrashing situation where
| a keypress takes hours to process. A better VM will take more pressure
| before it gets there and performance will degrade more gradually.
| But any VM can get into this situation.

So far I agree, and that implies that the VM needs to identify and
correct the situation.

| Swapping out whole processes can help this, but it will merely
| move the point where you get stuck. A load control system that
| kills processes when response is too slow is possible, but
| the problem here is that you can't get people to agree
| on how bad is too bad. It is sometimes ok to leave the machine
| alone crunching a big problem over the weekend. And sometimes
| you _need_ response much faster.
|
| And what app to kill in such a situation?
| You had a single memory pig, but it aint necessarily so.

I think the problem is not killing the wrong thing, but not killing
anything... We can argue any old factors for selection, but I would
first argue that the real problem is that nothing was killed because the
problem was not noticed.

One possible way to recognize the problem is to identify the ratio of
page faults to time slice used and assume there is trouble in River City
if that gets high and stays high. I leave it to the VM gurus to define
"high," but processes which continually block on page faults, as opposed
to i/o of some kind, are an indication of problems, and likely to be a
factor in deciding what to kill.

I think it gives a fair indication of whether things are getting done or
not, and I have said before that I like per-process page fault rates as a
datum to be included in VM decisions.
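
As a purely illustrative userspace sketch of that metric (not an existing
kernel mechanism), the raw numbers are already available in
/proc/<pid>/stat; something like the following computes a major-fault per
CPU-tick ratio, with the "suspicious" threshold picked arbitrarily:

/* faultrate.c - sketch of the heuristic above: compare a process's major
 * page faults with the CPU time it has actually used.  Field positions
 * follow proc(5): majflt is field 12, utime/stime are 14/15.  Note the
 * simple parse breaks if the comm field contains spaces. */
#include <stdio.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    char path[64];
    snprintf(path, sizeof(path), "/proc/%s/stat", argv[1]);
    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return 1;
    }

    /* pid (comm) state ppid pgrp session tty tpgid flags
       minflt cminflt majflt cmajflt utime stime ... */
    unsigned long minflt, cminflt, majflt, cmajflt, utime, stime;
    int n = fscanf(f,
        "%*d %*s %*c %*d %*d %*d %*d %*d %*u %lu %lu %lu %lu %lu %lu",
        &minflt, &cminflt, &majflt, &cmajflt, &utime, &stime);
    fclose(f);
    if (n != 6) {
        fprintf(stderr, "unexpected /proc/<pid>/stat format\n");
        return 1;
    }

    double ticks = (double)(utime + stime);
    double ratio = majflt / (ticks > 0 ? ticks : 1.0);
    printf("majflt=%lu cpu_ticks=%.0f majflt/tick=%.3f%s\n",
           majflt, ticks, ratio,
           ratio > 1.0 ? "  (mostly waiting on page-in?)" : "");
    return 0;
}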

--
bill davidsen <[email protected]>
His first management concern is not solving the problem, but covering
his ass. If he lived in the middle ages he'd wear his codpiece backward.

2001-10-23 23:21:58

by Ed Sweetman

Subject: Re: time tells all about kernel VM's

On Tuesday 23 October 2001 15:30, bill davidsen wrote:
> In article <[email protected]>,
>
> Helge Hafting <[email protected]> wrote:
> | Any VM with paging _can_ be forced into a thrashing situation where
> | a keypress takes hours to process. A better VM will take more pressure
> | before it gets there and performance will degrade more gradually.
> | But any VM can get into this situation.
>
> So far I agree, and that implies that the VM needs to identify and
> correct the situation.
The real reason I brought this up is that so much trouble was taken to
implement an OOM handler, yet this obviously known and simple situation
totally bypasses it. In what situation does the OOM handler even kick in?
I would think that if anything actually tried mapping all of free memory,
everything else would error out and you'd just move to one of the open
terminals and kill the process.
The only situation I can think of where the OOM killer would come into use
is leaking memory, but isn't that the same situation I described occurring?
What's different? And how is it that the kernel creating 600MB of buffers
was normal, instead of it keeping some memory back so my other programs
could stay alive? Or would that not even have mattered, since there is a
problem rooted in the VM I/O subsystem that allows situations like this?
Also, is the idea of preemption in the VM being considered for 2.5? It
seems like something like that would make this kind of problem moot.

I'm tempted to test out Andrea's VM to see if it locks up just as easily.
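
For what it's worth, the reason "mapping all of free memory" doesn't make
everything else error out is overcommit: the allocation itself normally
succeeds, and the pressure only appears once the pages are actually
touched, which is exactly the point where the OOM killer is supposed to
step in. A rough sketch, assuming default overcommit behaviour (don't run
it on a box you care about):

/* overcommit.c - why allocating "all of free memory" doesn't fail up
 * front: malloc keeps succeeding, and real trouble only starts when the
 * pages are touched.  Expect heavy swapping and then the OOM killer.
 * Don't run this anywhere you care about. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t chunk = 256UL << 20;               /* 256MB per allocation */
    size_t total = 0;
    char *p;

    while ((p = malloc(chunk)) != NULL) {     /* this part rarely fails */
        /* ...touching the pages is what creates the real pressure */
        for (size_t i = 0; i < chunk; i += 4096)
            p[i] = 1;
        total += chunk;
        printf("allocated and touched %zu MB so far\n", total >> 20);
    }
    printf("malloc finally failed after %zu MB\n", total >> 20);
    return 0;
}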

> [...]

2001-10-23 23:30:38

by Ed Sweetman

Subject: Re: time tells all about kernel VM's

Actually, now that I think about it, the reason the OOM killer didn't
activate was that the process wasn't what was going out of control. ps
showed my process correctly using the amount of memory I told it to use:
128MB. The kernel itself created the 600MB of buffers that caused its own
death. The OOM killer can't kill the kernel. So this problem has nothing
to do with tuning the OOM killer to handle locking situations. It sounds
like something a bit deeper in the VM world than that.


On Tuesday 23 October 2001 19:22, safemode wrote:
> [...]

2001-10-23 23:42:08

by Rik van Riel

Subject: Re: time tells all about kernel VM's

On Mon, 22 Oct 2001, safemode wrote:

> First the kernel created about 600MB of buffer in addition to the
> application specified 128MB of buffer i had it using (e2defrag -p
> 16384). This brought the system to a crawl.

Now that I think about it, and read the last message you wrote
in the thread ... do you have some vmstat output during this
time ?

Do you know if e2defrag somehow locks buffers into RAM ?

Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ (volunteers needed)

http://www.surriel.com/ http://distro.conectiva.com/

2001-10-24 02:08:47

by Ed Sweetman

Subject: Re: time tells all about kernel VM's

On Tuesday 23 October 2001 19:42, Rik van Riel wrote:
> On Mon, 22 Oct 2001, safemode wrote:
> > First the kernel created about 600MB of buffer in addition to the
> > application specified 128MB of buffer i had it using (e2defrag -p
> > 16384). This brought the system to a crawl.
>
> Now that I think about it, and read the last message you wrote
> in the thread ... do you have some vmstat output during this
> time ?
>
> Do you know if e2defrag somehow locks buffers into RAM ?
>
e2defrag has a setting to allocate buffers. According to the number I gave
it, it should have allocated 128MB; this is in accordance with what I
observed in ps aux during the run. All the vmstat data I had was in a
buffer and was lost when I later ran the graphviz programs and deadlocked
the computer. I was not expecting to reboot. I can always try it again.
e2defrag didn't deadlock the computer, but it did cause the unusual
behavior I observed just before deadlocking it with graphviz. What kind of
vmstat output do you want, every 10 seconds?
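
In case the scrollback gets lost again, one low-tech option is to append
/proc/meminfo (or the vmstat output itself) to a file once a second; a
minimal sketch, with the interval and log path as arbitrary choices:

/* memlog.c - append a timestamped copy of /proc/meminfo to a log file
 * once per second, so the data survives a crash or reboot.  The one-
 * second interval and the log path are arbitrary choices. */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    for (;;) {
        FILE *in = fopen("/proc/meminfo", "r");
        FILE *out = fopen("/var/log/meminfo.log", "a");
        char line[256];
        time_t now = time(NULL);

        if (!in || !out) {
            perror("fopen");
            return 1;
        }
        fprintf(out, "--- %s", ctime(&now));   /* ctime adds the newline */
        while (fgets(line, sizeof(line), in))
            fputs(line, out);
        fclose(in);
        fclose(out);                           /* flush every sample */
        sleep(1);
    }
}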

2001-10-24 09:07:51

by Helge Hafting

Subject: Re: time tells all about kernel VM's

bill davidsen wrote:
[...]
> | And what app to kill in such a situation?
> | You had a single memory pig, but it aint necessarily so.
>
> I think the problem is not killing the wrong thing, but not killing
> anything...

The OOM killer never ever kills anything when the machine _isn't_ OOM.
That is not its job. The OOM killer is there to fix one particular
crisis: when memory is needed for further processing but none
at all exists. It kills a process when the alternative is
a kernel crash.

The OOM killer is not there to make your machine perform
reasonably; it is not a load control measure.

> We can argue any old factors for selection, but I would
> first argue that the real problem is that nothing was killed because the
> problem was not noticed.

What I am saying is that you need another killer. The machine wasn't
OOM, so of course the OOM killer didn't notice. It was merely using
its memory in a stupid way, causing extremely bad performance. It isn't
OOM when there's 600M in buffers - all of that may be freed.

Fixing this case would be nice. But overload scenarios are
still possible, so what you want is probably an overload killer.

> One possible way to recognize the problem is to identify the ratio of
> page faults to time slice used and assume there is trouble in River City
> if that gets high and stays high. I leave it to the VM gurus to define
> "high," but processes which continually block for page fault as opposed
> to i/o of some kind are an indication of problems, and likely to be a
> factor in deciding what to kill.

Note that it is possible to have a machine that performs excellently
even if one process is thrashing to hell (and spending weeks on
a 5-minute task due to thrashing).

How? This is possible if the process isn't allowed to use more
than some reasonable fraction of RAM. It can swap a lot if it
needs more, but other, more reasonable processes will run
at full speed, not swap, and get enough cache for file io.
(You definitely want swap on a separate spindle in this case,
or you lose IO performance for the other processes.)

I believe some OSes, like VMS, can do this.
The problem with this approach is administration. There is no
automatic way to estimate how much RAM is reasonable for a process.

A big simulation with no IO can reasonably use 99% of the memory on
a dedicated machine. But doing that would kill both desktop
and server machines. So administrators would have to set memory
quotas for every process, which is a lot of work.

And you may have to set quotas for every run - so you can't
just stick it in a script. Gcc is one example - I have memory
enough to run several in parallel for a kernel compile,
but I run only one for a big C++ compile.
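
RLIMIT_RSS is not really enforced on Linux, but something in the spirit of
a per-process quota can be approximated by capping a process's address
space with setrlimit(RLIMIT_AS) before exec'ing it. A sketch (note that an
address-space cap is not the same thing as a resident-set cap, and the
limit is whatever the administrator picks per run):

/* memcap.c - run a command with its address space capped via
 * setrlimit(RLIMIT_AS), so allocations beyond the cap fail instead of
 * dragging the whole box into swap.  This limits virtual address space,
 * not resident pages, so it is only an approximation of an RSS quota.
 *
 * usage: ./memcap <megabytes> <command> [args...] */
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s <megabytes> <command> [args...]\n",
                argv[0]);
        return 1;
    }

    struct rlimit rl;
    rl.rlim_cur = rl.rlim_max = (rlim_t)atol(argv[1]) << 20;
    if (setrlimit(RLIMIT_AS, &rl) < 0) {
        perror("setrlimit");
        return 1;
    }

    execvp(argv[2], &argv[2]);                 /* the limit is inherited */
    perror("execvp");
    return 1;
}

For example, "./memcap 512 <command>" would cap that one command at 512MB
of address space; allocations past that fail inside the command instead of
pushing the whole box into swap.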

Helge Hafting

2001-10-24 11:55:30

by Ed Sweetman

Subject: Re: time tells all about kernel VM's

OK, I reran e2defrag and got the same effect.
This is the vmstat output, sampled every second. It starts out with my
normal load (but no mp3s playing). Then I start e2defrag with the same
arguments as before and allow it to run all the way through. It finishes,
but I don't close it until near the very end (which you can see from the
swap dropoff). Then I let my normal load be displayed again for a bit. One
thing I did notice, however, was that the VM handled this run quite a lot
better than it handled things after being up for 5 days, even though it
still created the 600MB of buffers.

Here are some /proc/meminfo readings

total: used: free: shared: buffers: cached:
Mem: 790016000 784146432 5869568 1929216 506896384 116387840
Swap: 133885952 87826432 46059520
MemTotal: 771500 kB
MemFree: 5732 kB
MemShared: 1884 kB
Buffers: 495016 kB
Cached: 29848 kB
SwapCached: 83812 kB
Active: 312468 kB
Inact_dirty: 298092 kB
Inact_clean: 0 kB
Inact_target: 157272 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 771500 kB
LowFree: 5732 kB
SwapTotal: 130748 kB
SwapFree: 44980 kB

total: used: free: shared: buffers: cached:
Mem: 790016000 782893056 7122944 188416 633905152 13586432
Swap: 133885952 116785152 17100800
MemTotal: 771500 kB
MemFree: 6956 kB
MemShared: 184 kB
Buffers: 619048 kB
Cached: 7920 kB
SwapCached: 5348 kB
Active: 320744 kB
Inact_dirty: 311756 kB
Inact_clean: 0 kB
Inact_target: 157272 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 771500 kB
LowFree: 6956 kB
SwapTotal: 130748 kB
SwapFree: 16700 kB


Attachments:
vmstat_output (66.44 kB)

2001-10-24 18:05:35

by Luigi Genoni

Subject: Re: time tells all about kernel VM's



On Wed, 24 Oct 2001, safemode wrote:

> OK, I reran e2defrag and got the same effect.
> This is the vmstat output, sampled every second. It starts out with my
> normal load (but no mp3s playing). Then I start e2defrag with the same
> arguments as before and allow it to run all the way through. It finishes,
> but I don't close it until near the very end (which you can see from the
> swap dropoff). Then I let my normal load be displayed again for a bit. One
> thing I did notice, however, was that the VM handled this run quite a lot
> better than it handled things after being up for 5 days, even though it
> still created the 600MB of buffers.

If I remember well, e2defrag only worked with ext2 using a 1k block
size, and the latest version was compiled against the 2.0.12 kernel (I
also made a patch afterwards to compile it with 2.0.x kernels); then ext2
simply evolved and e2defrag did not. (By the way, the e2defrag sources are
really instructive for learning how a block FS works.)

I have used e2defrag since its earlier versions (just with old slow
disks; now it is almost useless to me, and I went to journaled FSes). If I
remember well, the behaviour you are describing was usual with 2.0 kernels.
If the pool is too big, I saw that e2dump shows a lot of inodes that left
their group (sic!), and there could also be some FS corruption.
e2defrag was written to use the buffer cache, and the VM has since changed
the details of that behaviour. Could it be that what you see is due to
those changes?


>
> Here are some /proc/meminfo readings
>
> total: used: free: shared: buffers: cached:
> Mem: 790016000 784146432 5869568 1929216 506896384 116387840
> Swap: 133885952 87826432 46059520
> MemTotal: 771500 kB
> MemFree: 5732 kB
> MemShared: 1884 kB
> Buffers: 495016 kB
> Cached: 29848 kB
> SwapCached: 83812 kB
> Active: 312468 kB
> Inact_dirty: 298092 kB
> Inact_clean: 0 kB
> Inact_target: 157272 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 771500 kB
> LowFree: 5732 kB
> SwapTotal: 130748 kB
> SwapFree: 44980 kB
>
> total: used: free: shared: buffers: cached:
> Mem: 790016000 782893056 7122944 188416 633905152 13586432
> Swap: 133885952 116785152 17100800
> MemTotal: 771500 kB
> MemFree: 6956 kB
> MemShared: 184 kB
> Buffers: 619048 kB
> Cached: 7920 kB
> SwapCached: 5348 kB
> Active: 320744 kB
> Inact_dirty: 311756 kB
> Inact_clean: 0 kB
> Inact_target: 157272 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 771500 kB
> LowFree: 6956 kB
> SwapTotal: 130748 kB
> SwapFree: 16700 kB
>

2001-10-24 18:36:05

by Ed Sweetman

Subject: Re: time tells all about kernel VM's

On Wednesday 24 October 2001 14:05, Luigi Genoni wrote:
> On Wed, 24 Oct 2001, safemode wrote:
> > [...]
>
> If I remember well, e2defrag only worked with ext2 using a 1k block
> size, and the latest version was compiled against the 2.0.12 kernel (I
> also made a patch afterwards to compile it with 2.0.x kernels); then ext2
> simply evolved and e2defrag did not. (By the way, the e2defrag sources are
> really instructive for learning how a block FS works.)

e2defrag defaults to 4k blocks. Version 0.73pjm1, 30 Apr 2001.

> I have used e2defrag since its earlier versions (just with old slow
> disks; now it is almost useless to me, and I went to journaled FSes). If I
> remember well, the behaviour you are describing was usual with 2.0 kernels.
> If the pool is too big, I saw that e2dump shows a lot of inodes that left
> their group (sic!), and there could also be some FS corruption.
> e2defrag was written to use the buffer cache, and the VM has since changed
> the details of that behaviour. Could it be that what you see is due to
> those changes?

You say it is the same behavior as 2.0, yet you also say that I could be
seeing this problem due to _changes_ in the VM. So the comparison to 2.0
doesn't really tell us anything, since it has nothing to do with what 2.0
was doing.

2001-10-24 19:57:07

by Mike Fedyk

Subject: Re: time tells all about kernel VM's

On Wed, Oct 24, 2001 at 07:55:27AM -0400, safemode wrote:
> OK, I reran e2defrag and got the same effect.
> This is the vmstat output, sampled every second. It starts out with my
> normal load (but no mp3s playing). Then I start e2defrag with the same
> arguments as before and allow it to run all the way through. It finishes,
> but I don't close it until near the very end (which you can see from the
> swap dropoff). Then I let my normal load be displayed again for a bit. One
> thing I did notice, however, was that the VM handled this run quite a lot
> better than it handled things after being up for 5 days, even though it
> still created the 600MB of buffers.
>

Hmm. I have seen similar behavior with:

find . -type f -exec cat '{}' \; > /dev/null

I get a very big buffer cache, and very small page cache.

Kernel:
Now : 20:56:14 running Linux
2.4.12-ac5+acct-entropy+preempt+netdev-ramdom+vm-free-swapcache

Btw, this was on a read-only NTFS partition. I can test with ext3 if
needed...

2001-10-24 22:01:08

by Luigi Genoni

Subject: Re: time tells all about kernel VM's



On Wed, 24 Oct 2001, safemode wrote:

> On Wednesday 24 October 2001 14:05, Luigi Genoni wrote:
> > On Wed, 24 Oct 2001, safemode wrote:
> > > [...]
> >
> > If I remember well, e2defrag only worked with ext2 using a 1k block
> > size, and the latest version was compiled against the 2.0.12 kernel (I
> > also made a patch afterwards to compile it with 2.0.x kernels); then ext2
> > simply evolved and e2defrag did not. (By the way, the e2defrag sources
> > are really instructive for learning how a block FS works.)
>
> e2defrag defaults to 4k blocks. Version 0.73pjm1, 30 Apr 2001.
Mmm, a new version. The previous one was some years old.
>
> > I have used e2defrag since its earlier versions (just with old slow
> > disks; now it is almost useless to me, and I went to journaled FSes).
> > If I remember well, the behaviour you are describing was usual with 2.0
> > kernels. If the pool is too big, I saw that e2dump shows a lot of inodes
> > that left their group (sic!), and there could also be some FS corruption.
> > e2defrag was written to use the buffer cache, and the VM has since
> > changed the details of that behaviour. Could it be that what you see is
> > due to those changes?
>
> You say it is the same behavior as 2.0, yet you also say that I could be
> seeing this problem due to _changes_ in the VM. So the comparison to 2.0
> doesn't really tell us anything, since it has nothing to do with what 2.0
> was doing.
>

Sorry for the bad English, and the bad logic.
I meant that you could also see this behaviour with 2.0 kernels.
With 2.2 kernels this behaviour disappeared. I have never tried e2defrag
with 2.4 kernels, since I use ReiserFS and JFS now, but my idea was that
the different buffer cache management is probably involved, as it was
with 2.0.

Sorry for the bad logic; it has been too long since I slept :).

Luigi