Issue:
Is it possible to set the disk cache size to a higher value to avoid
temporary freezing while untarring large files? Memory is not an
issue, I have plenty of it. The disk drive is a good drive, does
29.2MB/s sustained in single user mode, 25MB/s when I have a lot of
processes open. Here is what I think is going on. Sometimes, when I
untar things, or do things that consume a lot of disk space rapidly,
they do it VERY quickly, and then the disk rumbles on for 5-20
seconds after it is done. What accounts for this?
Example:
[20:10]% tar -xf 2GB-FILE.tar
[20:30]% # Hard disk is still grinding.
[21:00]% # Hard disk stops grinding.
In essence, the 'tar' command is finished; however, for 30-60 seconds
after it has finished, it is actually still decompressing the data to
the file on the disk.
I have not tested ALL kernels to pin down when or where this started,
but could someone provide a further explanation as to why the disk
scheduler works like this?
On Solaris, when I untar a file, the disk stops grinding when the tar
process is finished, and the system is totally usable.
With Linux, when I untar the file, the system may completely lock up
for 3-5 seconds at a time during the 30-60 seconds of disk activity
after the untar process has ended.
System Setup: P3/866
1GB RAM
2GB SWAP
Kernel 2.4.16
Result:
Just read this [bottom] after trying to burn 2 CDs (luckily on
CDRWs) at the same time on two different IDE bus controllers while
untarring a 1.6GB file. With earlier kernels, this is usually not a
problem.
CDRW1 = Plextor v1.09
CDRW2 = HP 7510i
Burnproof kicked in for the Plextor, I love Plextor drives.
With the HP, it didn't have enough data to fill the buffer, and
therefore caused a buffer underrun; easy to blank (blank=toc) and
re-write, however.
http://lwn.net/2001/1129/kernel.php3
The current stable kernel release is 2.4.16. This release, the first
by Marcelo Tosatti, contains little beyond the filesystem fix. This
release does seem to deserve the name "stable," though there are
still some persistent complaints about interactive response in the
presence of heavy I/O. The culprit appears to be the disk I/O
scheduler; a real fix for that problem could be long in coming. The
2.4.17-pre1 prepatch contains a number of items including a new USB
maintainer and a devfs update.
Wild guesses follow..
On Thu, 29 Nov 2001, war war wrote:
> In essence, the 'tar' command is finished; however, for 30-60 seconds
> after it has finished, it is actually still decompressing the data to
> the file on the disk.
It's probably writing it from RAM to disk. 60 seconds seems like a looong
time, tho. What does iostat -x tell ya during the time when tar is
finished and the disk is still going?
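A minimal check, assuming iostat from the sysstat package is
installed (the one-second interval is just an example):

  iostat -x 1

Watch the write columns for the device; they should stay busy until
the flush completes.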
> On Solaris, when I untar a file, the disk stops grinding when the tar
> process is finished, and the system is totally usable.
You can mount your filesystem synchronously..
--
Blue Lang, editor, b-side.org http://www.b-side.org
2315 McMullan Circle, Raleigh, North Carolina, 27608 919 835 1540
war war wrote:
>
> Issue:
>
Here I go again.
> Is it possible to set the disk cache size to a higher value to avoid
> temporary freezing while untarring large files? Memory is not an
> issue, I have plenty of it. The disk drive is a good drive, does
> 29.2MB/s sustained in single user mode, 25MB/s when I have a lot of
> processes open. Here is what I think is going on. Sometimes, when I
> untar things, or do things that consume a lot of disk space rapidly,
> they do it VERY quickly, and then the disk rumbles on for 5-20
> seconds after it is done. What accounts for this?
What is unclear from your report is how this behaviour differs
from what linux has _always_ done. The kernel implements
delayed writeback. Write data isn't fully flushed until up
to thirty seconds after it was written.
So... what filesystem do you use, and how do you think current
behaviour differs from earlier kernels?
I can tell you a few things: there are basically three ways in
which write() data gets IO started on it:
1: Directly, when someone does a write(), if the amount of pending write
data is too high.
2: From within the VM code, when it detects that the ratios of
free-to-dirty-to-clean memory are getting out of whack.
3: Within the kupdate daemon, when it is detected that the
data is thirty seconds old.
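The thirty-second limit in method 3, and kupdate's wakeup interval,
are exposed as tunables through /proc/sys/vm/bdflush on 2.4 kernels.
A quick read-only look (the exact field layout varies between 2.4
releases, so check Documentation/sysctl/vm.txt in your tree before
writing anything back):

  cat /proc/sys/vm/bdflush

The interval and buffer-age fields are in jiffies (100 per second on
x86), so a buffer age of 3000 jiffies is the thirty seconds
mentioned above.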
In current kernels, with your sort of workload, it appears that
all IO is being initiated by method 2. It also appears that
method 2 simply doesn't do it very well - I've earlier observed
that simply writing a 650 megabyte chunk of /dev/zero into a
file runs 30% faster on ext3 than on ext2. Because ext2 uses
method 2, and it should be using method 1, and ext3 uses, err,
method 4.
Are you inclined to try a patch, and let us know if the result
is better? (coz if you don't nothing will happen!)
http://www.zip.com.au/~akpm/linux/2.4/2.4.17-pre1/vm-fixes.patch
It causes writeout to be initiated via the dirty buffer LRU, not the
inactive list.
Also,
http://www.zip.com.au/~akpm/linux/2.4/2.4.17-pre1/elevator.patch
It lets you read data from the disk when writes are happening.
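Assuming a clean 2.4.17-pre1 tree, applying them is the usual
routine (paths are examples only):

  cd /usr/src/linux
  patch -p1 < /path/to/vm-fixes.patch
  patch -p1 < /path/to/elevator.patch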
-
On Thu, 29 Nov 2001, Andrew Morton wrote:
[snip]
> In current kernels, with your sort of workload, it appears that
> all IO is being initiated by method 2. It also appears that
> method 2 simply doesn't do it very well - I've earlier observed
> that simply writing a 650 megabyte chunk of /dev/zero into a
> file runs 30% faster on ext3 than on ext2. Because ext2 uses
> method 2, and it should be using method 1, and ext3 uses, err,
> method 4.
I too am seeing this and can't remember seeing it so clearly in the past
as I do now. I've seen it in 2.4.10-ac12 and 2.4.15-pre7 and
2.4.16+your_patch_for_reads_while_writing.
I recently backed up my /home on a friend's machine and then reformatted my
partition from reiserfs to ext3 and then put all the data back.
When I did this I saw this behaviour very clearly. I have quite a lot of
'vmstat 1' output that shows that the machine is receiving a lot of data
but not writing anything out, which is completely fine, but when the
disk output begins the network input stops and only continues after the
writeout is complete. There are variations in how much it stalls, but it
always stalls for some time, sometimes a short while and sometimes until
the buffer is empty, and then the story begins all over again.
The disk can sustain >30MB/s output and the network is 100Mbit/s, which
is 12.5MB/s raw, so a maximum of 10-11MB/s in after protocol overhead.
I sent the 'vmstat 1' output to Rik van Riel and the response I got on irc
was something like "ouch! Spikey!". He didn't think it should work like
this either. To me it seems that the writeout starts too late or that no
new data is put into the buffer while it's being emptied.
> Are you inclined to try a patch, and let us know if the result
> is better? (coz if you don't nothing will happen!)
I'll try this tomorrow and give you a full report with vmstat output and
slabinfo/meminfo and whatever more you might want.
> http://www.zip.com.au/~akpm/linux/2.4/2.4.17-pre1/vm-fixes.patch
>
> It causes writeout to be initiated via the dirty buffer LRU, not the
> inactive list.
>
> Also,
>
> http://www.zip.com.au/~akpm/linux/2.4/2.4.17-pre1/elevator.patch
>
> It lets you read data from the disk when writes are happening.
I'll try with 2.4.17-pre1 + your patches.
All my tests will be done on ext3 filesystems.
/Martin
Never argue with an idiot. They drag you down to their level, then beat you with experience.
war war wrote:
>
> Issue:
>
> Is it possible to set the disk cache size to a higher value to avoid
> temporary freezing while untarring large files? Memory is not an
> issue, I have plenty of it.
Absolutely all free memory may be used for disk caching. So
no, you can't get a bigger cache because it is already at
the highest possible setting. You don't have more memory
for this - all is used already.
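You can see this with free: the "-/+ buffers/cache" line shows how
much memory would become available if the caches were dropped. The
numbers below are made up for illustration:

  free -m
               total    used    free  shared  buffers  cached
  Mem:          1024    1010      14       0      120     600
  -/+ buffers/cache:     290     734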
> The disk drive is a good drive, does
> 29.2MB/s sustained in single user mode, 25MB/s when I have a lot of
> processes open. Here is what I think is going on. Sometimes, when I
> untar things, or do things that consume a lot of disk space rapidly,
> they do it VERY quickly, and then the disk rumbles on for 5-20
> seconds after it is done. What accounts for this?
>
This is exactly what you should expect with lots of cache:
You run a big untar.
This is written straight to the disk cache RAM, which is why
it finishes very fast. It isn't really on disk yet -
it is in the cache.
You may go on doing other work; the tar is over. But the
data have to get to disk too, not only the cache. That's
the rumbling you notice - stuff being written from cache
onto the disk itself.
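If you want to watch it happen, run vmstat in another terminal while
the tar is going:

  vmstat 1

The "bo" column (blocks written out) stays busy for a while after
tar exits, then drops back once the cache has been flushed.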
>
> In essence, the 'tar' command is finished; however, for 30-60 seconds
> after it has finished, it is actually still decompressing the data to
> the file on the disk.
>
It isn't decompressing, merely writing. All decompressing etc.
that "tar" does is done - but the stuff went into your (big)
disk cache. What you hear is the uncompressed stuff being
written from cache to disk.
Of course the files are instantly usable even when they aren't yet
written to disk. This is because you actually get stuff from the cache,
never from the disk itself.
> I have not tested ALL kernels to pin down when or where this started,
> but could someone provide a further explanation as to why the disk
> scheduler works like this?
It always worked this way. Forever.
> On Solaris, when I untar a file, the disk stops grinding when the tar
> process is finished, and the system is totally usable.
Synchronously mounted then. Worse performance, but safer if you
have the bad habit of turning the machine off when you
_believe_ it is finished.
You can force this behaviour under linux too - use
"sync" to force synchronization when you feel you need it.
Or mount synchronously - but then you take a performance
hit all the time.
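Both are one-liners; the device and mount point below are examples
only:

  sync                                   # flush all dirty data now
  mount -o remount,sync /dev/hda1 /home  # pay the price on every write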
> With Linux, when I untar the file, the system may completely lock up
> for 3-5 seconds at a time during the 30-60 seconds of disk activity
> after the untar process has ended.
Some disk systems are cpu intensive. SCSI (or properly
tuned IDE using (u)dma and irq unmasking) is much better.
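For IDE that usually means something like this (the device name is
an example, and -u1 is unsafe on a few older chipsets, so test it
before putting it in a boot script):

  hdparm -d1 -u1 /dev/hda   # -d1 enables DMA, -u1 unmasks IRQs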
Helge Hafting
On Fri, 30 Nov 2001 14:59:00 +0100
Helge Hafting <[email protected]> wrote:
> Absolutely all free memory may be used for disk caching. So
> no, you can't get a bigger cache because it is already at
> the highest possible setting. You don't have more memory
> for this - all is used already.
May I limit this memory? For a long time now I've been working all day with no physical memory free.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pablo Borges [email protected]
-------------------------------------------------------------------
Tecnologia UOL
Debian: The 100% suck free linux distro.
SETI is lame. http://www.distributed.net - Dnetc is XNUG!
> > Absolutely all free memory may be used for disk caching. So
> > no, you can't get a bigger cache because it is already at
> > the highest possible setting. You don't have more memory
> > for this - all is used already.
>
> May I limit this memory? For a long time now I've been working all day with no physical memory free.
You can try rtlinux. In rtlinux (realtime linux), you tell linux how much
memory the kernel will have access to, and let specially written apps
take the rest.
--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA
Computers are like air conditioners.
They stop working when you open Windows.
Don't we have a "don't eat my whole memory, disk cache" option on linux?
On Wed, 5 Dec 2001 21:07:42 +0100 (CET)
Roy Sigurd Karlsbakk <[email protected]> wrote:
> > > Absolutely all free memory may be used for disk caching. So
> > > no, you can't get a bigger cache because it is already at
> > > the highest possible setting. You don't have more memory
> > > for this - all is used already.
> >
> > May I limit this memory? For a long time now I've been working all day
> > with no physical memory free.
>
> You can try rtlinux. In rtlinux (realtime linux), you tell linux how
> much memory the kernel will have access to, and let specially written
> apps take the rest.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pablo Borges [email protected]
-------------------------------------------------------------------
Tecnologia UOL
Debian: The 100% suck free linux distro.
SETI is lame. http://www.distributed.net - Dnetc is XNUG!
Is it really necessary? Free memory's a waste! The cache will be discarded
the moment an application needs the memory.
What's the problem? It speeds up disk I/O for recently used files.
On Thu, 6 Dec 2001, Pablo Borges wrote:
>
> Don't we have a "don't eat my whole memory, disk cache" option on linux?
--
Roy Sigurd Karlsbakk, MCSE, MCNE, CLS, LCA
Computers are like air conditioners.
They stop working when you open Windows.
So ppl, please help me find this problem.
Once upon a time, I had 30-60 days of uptime on my workstation, only rebooting for kernel updates or power failures. Nowadays, I have to reboot my machine every day because there's no physical memory available. I watch my disk grinding under heavy I/O just to open X, my browser, and so on.
Then, one day, I decided to drop to init 1 and umount everything, and I saw an amazing 70% of my physical memory released. I assumed that the disk cache had held it all.
So, if this is correct, I want either to keep the cache under control or to flush it when I want. François Cami told me there is an option in the -aa VM that helps with that:
> easy.
> echo 400 > /proc/sys/vm/vm_mapped_ratio
> the higher the number, the smaller/less aggressive the disk cache.
Tnx for your help,
[]'s
Pablo
On Thu, 6 Dec 2001 19:10:56 +0100 (CET)
Roy Sigurd Karlsbakk <[email protected]> wrote:
> Is it really necessary? Free memory's a waste! The cache will be
> discarded the moment an application needs the memory.
>
> What's the problem? It speeds up disk I/O for recently used files.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pablo Borges [email protected]
-------------------------------------------------------------------
Tecnologia UOL
Debian: The 100% suck free linux distro.
SETI is lame. http://www.distributed.net - Dnetc is XNUG!
On Thu, 6 Dec 2001, Roy Sigurd Karlsbakk wrote:
> Is it really neccecary? Free memory's a waste! The cache will be
> discarded the moment an application needs the memory.
That's not the case with use-once ...
Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/
http://www.surriel.com/ http://distro.conectiva.com/
On Thu, 6 Dec 2001, Rik van Riel wrote:
> On Thu, 6 Dec 2001, Roy Sigurd Karlsbakk wrote:
>
> > Is it really necessary? Free memory's a waste! The cache will be
> > discarded the moment an application needs the memory.
>
> That's not the case with use-once ...
A little more verbosity please?
-Mike
On Thu, 6 Dec 2001, Mike Galbraith wrote:
> On Thu, 6 Dec 2001, Rik van Riel wrote:
> > On Thu, 6 Dec 2001, Roy Sigurd Karlsbakk wrote:
> >
> > > Is it really necessary? Free memory's a waste! The cache will be
> > > discarded the moment an application needs the memory.
> >
> > That's not the case with use-once ...
>
> A little more verbosity please?
Once a page is used twice, it's not a candidate for eviction
until (most of) the use-once pages are gone.
This means that if you have, say, 40 MB of used-twice-but-never-again
buffer cache memory, this memory will never be evicted until other
pages get promoted from use-once to active.
Now say you have 200 MB of RAM, 40 MB of which are the above
buffer cache pages. Now you start a program which needs 170
MB of RAM.
This 170 MB program touches each page once before starting
at the front again, which means all its pages are used once
before getting evicted ... and they never get promoted to
active pages so the 40 MB of no longer used buffer cache
never gets evicted.
Use-once has this property in principle and the warnings have
gone out since around 2.4.8-pre4, but it's in 2.4 now so you're
stuck with it.
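A rough way to see the effect from userspace, with paths and sizes
as examples only (the outcome depends heavily on kernel version and
total RAM):

  cat /data/40MB-of-files/* > /dev/null   # first touch: use-once
  cat /data/40MB-of-files/* > /dev/null   # second touch: now active
  cat /data/170MB-file > /dev/null        # big streaming read, use-once
  time cat /data/40MB-of-files/* > /dev/null   # still cached, so fast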
cheers,
Rik
--
DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/
http://www.surriel.com/ http://distro.conectiva.com/
> Once a page is used twice, it's not a candidate for eviction
> until (most of) the use-once pages are gone.
>
> This means that if you have, say, 40 MB of used-twice-but-never-again
> buffer cache memory, this memory will never be evicted until other
> pages get promoted from use-once to active.
It's worth noting, btw, that you can intentionally exploit this in an app
to get an unfair share of memory. That makes me very dubious about the
heuristic.
On Thu, 6 Dec 2001, Alan Cox wrote:
> > Once a page is used twice, it's not a candidate for eviction
> > until (most of) the use-once pages are gone.
> >
> > This means that if you have, say, 40 MB of used-twice-but-never-again
> > buffer cache memory, this memory will never be evicted until other
> > pages get promoted from use-once to active.
>
> It's worth noting, btw, that you can intentionally exploit this in an app
> to get an unfair share of memory. That makes me very dubious about the
> heuristic.
In Rik's VM I had a problem with use-once when Bonnie was doing
rewrite. Its used-twice data became too hard to get rid of at
the aging volume we were doing, leading to an inactive shortage
and unwanted swapping. The active list grew until ~all of ram
was on the active list. I 'fixed' it here by keeping the dirty
list very strictly ordered (lengthened it too) and requiring more
than two accesses before promoting to active.
I have not seen this behavior in the new VM yet.
-Mike
> In Rik's VM I had a problem with use-once when Bonnie was doing
> rewrite. Its used-twice data became too hard to get rid of at
You are not supposed to use Riel's VM with use-once. The two were never
intended to be combined.
Alan
On Thu, 6 Dec 2001 19:10:56 +0100 (CET)
Roy Sigurd Karlsbakk <[email protected]> wrote:
> Is it really necessary? Free memory's a waste! The cache will be discarded
> the moment an application needs the memory.
This is not true for all cases.
Regards,
Stephan
On Fri, 7 Dec 2001, Alan Cox wrote:
> > In Rik's VM I had a problem with use-once when Bonnie was doing
> > rewrite. Its used-twice data became too hard to get rid of at
>
> You are not supposed to use Riel's VM with use-once. The two were never
> intended to be combined.
I like the idea behind use-once very much, but given the side-effects
seen here.... I'm not sure.
-Mike
On Fri, 7 Dec 2001, Mike Galbraith wrote:
> On Fri, 7 Dec 2001, Alan Cox wrote:
>
> > > In Rik's VM I had a problem with use-once when Bonnie was doing
> > > rewrite. Its used-twice data became too hard to get rid of at
> >
> > You are not supposed to use Riel's VM with use-once. The two were never
> > intended to be combined.
>
> I like the idea behind use-once very much, but given the side-effects
> seen here.... I'm not sure.
Page aging achieves something pretty close to use-once, but
without the side effects. Pages which are used once put some
pressure on the working set, but very little.
kind regards,
Rik
--
Shortwave goes a long way: irc.starchat.net #swl
http://www.surriel.com/ http://distro.conectiva.com/