2002-01-16 19:04:56

by Andrea Arcangeli

Subject: Rik spreading bullshit about VM

I read here:

http://linux.html.it/articoli/rik_van_riel_ita1.htm

[..] La nuova VM ha migliori performance rispetto alla vecchia sui
tipici sistemi desktop ... ma fa fiasco terribilmente su più sistemi di
quanto non lo facesse la vecchia VM. Redhat, per esempio, non ha potuto
inserire la nuova VM nella sua distribuzione perché cadrebbe a pezzi per
i database server, [..]

This is total bullshit. If there is one thing the -aa VM is good at, it
is DBMS workloads: it was basically designed for them. It is very
lightweight, with basically no VM overhead even under very heavy I/O.

If Red Hat doesn't use the -aa VM in their kernels, that's either a
political decision or they're not good enough at the VM. I can tell you
there are a huge number of users very happy with the -aa VM, mainly on
DBMS, and I know some Red Hat partner is also using the -aa VM for their
internal DBMS work (incidentally, I assume rh kernels aren't good enough
for them?). I don't care much about interviews; real people will choose
their VM based on facts anyway. You may stop someone from trying it, or
convince someone who doesn't need much VM anyway, by saying weird things
like "collapse under high load or breaks on databases", but it's still
annoying to read such bullshit and have people sending me emails about
stuff that isn't true.

So if you can prove any of your statements with numbers showing DB
regressions with the -aa VM compared to 2.4.9 or
2.4.9+ac/rh/whatever, please let us know. I have a huge amount of
documentation proving the exact opposite (the first few impressive
emails I found are attached) and, most important, I don't have a single
bug report about the current 2.4.18pre2aa2 VM (except perhaps the
bdflush wakeup, which seems to come a little too late and leads to lower
numbers under slow write loads etc.; this is fixable with bdflush
tuning). The mainline VM kills too easily; this is fixed in the -aa VM,
which also has a number of other issues resolved, but the mainline 2.4
VM isn't that far off either. In the last few days I was playing with
pte-highmem; soon I will spend some time merging the -aa VM into
mainline with Marcelo, if he'd like to.

Andrea

PS. I know the interviewer and he's usually very accurate, so I don't
think this could be a misunderstanding where you say one thing and they
write another just to create trouble.


Attachments:
(No filename) (2.27 kB)
(No filename) (8.82 kB)
(No filename) (2.69 kB)
(No filename) (3.01 kB)
(No filename) (3.44 kB)

2002-01-16 20:19:08

by Jose Luis Domingo Lopez

Subject: Re: Rik spreading bullshit about VM

On Wednesday, 16 January 2002, at 20:04:59 +0100,
Andrea Arcangeli wrote:

> I read here:
>
> http://linux.html.it/articoli/rik_van_riel_ita1.htm
>
I'm not a programmer. I'm not a kernel hacker. I don't like _very_
valuable people wasting their time reading messages like mine. I give
credit to all of you (Andrea, Rik, Linus, Alan and everyone else), and
sincerely thank you for your hard and good work. I'm not trying to start
or fuel a flame war.

But, Andrea, sometimes people like me _feel_ that bugs supposedly related
to your VM implementation, and reported on this list, go without your
attention. I'm completely sure you read all bug reports, and work
"off-line" to solve them (-aa seems to be where they live). I am not one
of those who like talking a lot, but it would be very nice (and it would
surely cut down on repeated bug reports and never-ending threads about
known problems) if you sometimes answered bug reports.

I try to keep in sync with the list, and it's very difficult to find out
whether those "corner cases" your VM still seems to suffer from are
solved in your tree, known but not solved, solved and merged, solved and
sent to Marcelo, etc.

In short, it's more a feeling that no one cares about the few problems
that still exist than any "objective instability" of the VM.

Just my 0.02.

--
José Luis Domingo López
Linux Registered User #189436 Debian Linux Woody (P166 64 MB RAM)

jdomingo AT internautas DOT org => Spam at your own risk

2002-01-16 20:46:49

by Bongani Hlope

Subject: Re: Rik spreading bullshit about VM

On Wed, 2002-01-16 at 21:04, Andrea Arcangeli wrote:
>
[---SNIP---]
> 3) Details of typical runs:
>
> 2.4.7:

[---SNIP---]

> ------

>
> c)
> meminfo during run:
>           total:      used:      free:   shared:  buffers:   cached:
> Mem:  1051394048 1047044096    4349952 443473920   1052672 294608896
> Swap: 4294934528 1314680832 2980253696
> MemTotal: 1026752 kB
> MemFree: 4248 kB
> MemShared: 433080 kB <============== [1]
> Buffers: 1028 kB
> Cached: 20900 kB
> SwapCached: 266804 kB
> Active: 694700 kB
> Inact_dirty: 19296 kB
> Inact_clean: 7816 kB
> Inact_target: 12836 kB
> HighTotal: 131072 kB
> HighFree: 1460 kB
> LowTotal: 895680 kB
> LowFree: 2788 kB
> SwapTotal: 4194272 kB
> SwapFree: 2910404 kB
> NrSwapPages: 727602 pages
>
>
> ###################################################################
>
> 2.4.14:
> --------
>
[--SNIP--]
> c)
> meminfo during run:
>           total:      used:      free:   shared:  buffers:   cached:
> Mem:  1052712960 1046528000    6184960         0    319488 856850432
> Swap: 4294934528 1313320960 2981613568
> MemTotal: 1028040 kB
> MemFree: 6040 kB
> MemShared: 0 kB <================ [2]
> Buffers: 312 kB
> Cached: 714576 kB
> SwapCached: 122192 kB
> Active: 851084 kB
> Inactive: 113592 kB
> HighTotal: 131072 kB
> HighFree: 2044 kB
> LowTotal: 896968 kB
> LowFree: 3996 kB
> SwapTotal: 4194272 kB
> SwapFree: 2911732 kB

Andrea, your VM also works perfectly for me, and I think you are doing
a brilliant job on it. The only thing I have noticed that worries
me is the lack of MemShared. I was about to study the code to
find out why it is always 0 kB on my PC, but these stats show the
same result. Do you have any idea why this is so? I will still study
the code (just for the fun of it); I might learn something about
the VM system.

-Bongani
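
For anyone who wants to compare MemShared across kernels without reading the dumps by eye, here is a minimal sketch. It assumes the meminfo text has been pasted from a mail like the one above, possibly still carrying "> " quote prefixes. The names parse_meminfo and SAMPLE are illustrative only and do not come from the thread, and the sketch only reads the "Name: N kB" lines; it says nothing about why the field is 0.

```python
import re

# Matches lines such as "MemShared:        0 kB". The optional "> " prefix
# lets the parser accept text pasted straight out of a quoted email.
_FIELD = re.compile(r"^(?:>\s*)*(\w+):\s+(\d+)\s+kB\b", re.MULTILINE)


def parse_meminfo(text):
    """Return {field_name: value_in_kB} for every 'Name: N kB' line."""
    return {name: int(value) for name, value in _FIELD.findall(text)}


# A few lines copied from the 2.4.14 dump quoted above.
SAMPLE = """\
> Mem: 1052712960 1046528000 6184960 0 319488 856850432
> MemTotal: 1028040 kB
> MemFree: 6040 kB
> MemShared: 0 kB
> Buffers: 312 kB
> Cached: 714576 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(info["MemShared"], "kB shared of", info["MemTotal"], "kB total")
```

The byte-valued "Mem:" summary line is skipped on purpose, because it does not end in "kB".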

2002-01-16 20:53:09

by John Levon

Subject: Re: Rik spreading bullshit about VM

On Wed, Jan 16, 2002 at 10:58:31PM +0200, Bongani Hlope wrote:

> a brilliant job on it. The only thing that I have noticed that worries
> me is the lack of MemShared. I wass about to study the code and try to
> find out why it is always 0 kB on my PC, but these stats also show the
> same results. Do you have any idea why this is so. I will still study
> the code (just for the fun of it), maybe I might learn something about
> te VM system.

http://www.tux.org/lkml/#s14-3

> Please read the FAQ at http://www.tux.org/lkml/


regards
john

--
"Now why did you have to go and mess up the child's head, so you can get another gold waterbed ?
You fake-hair contact-wearing liposuction carnival exhibit, listen to my rhyme ..."

2002-01-16 20:58:59

by Richard Gooch

Subject: Re: Rik spreading bullshit about VM

Andrea Arcangeli writes:
>
> I read here:
>
> http://linux.html.it/articoli/rik_van_riel_ita1.htm
>
> [..] La nuova VM ha migliori performance rispetto alla vecchia sui
> tipici sistemi desktop ... ma fa fiasco terribilmente su più sistemi di
> quanto non lo facesse la vecchia VM. Redhat, per esempio, non ha potuto
> inserire la nuova VM nella sua distribuzione perché cadrebbe a pezzi per
> i database server, [..]
>
> This is total bullshit. If there's something where the -aa VM is good
> are the DBMS, that was designed for it basically, very lightweight,
> basically no VM overhead also under very heavy I/O.

I don't know why you're so upset. As far as I can tell, Rik has warmly
praised your VM in the above message. Of course, since you didn't
provide an English translation, I can't really be sure. Perhaps Rik
was talking about Virtual Machines, and not Virtual Memory? Or perhaps
Virgin Mary?

Regards,

Richard....
Permanent: [email protected]
Current: [email protected]

2002-01-16 21:10:33

by Bongani Hlope

Subject: Re: Rik spreading bullshit about VM

On Wed, 2002-01-16 at 22:55, John Levon wrote:
> On Wed, Jan 16, 2002 at 10:58:31PM +0200, Bongani Hlope wrote:
>
> > a brilliant job on it. The only thing that I have noticed that worries
> > me is the lack of MemShared. I wass about to study the code and try to
> > find out why it is always 0 kB on my PC, but these stats also show the
> > same results. Do you have any idea why this is so. I will still study
> > the code (just for the fun of it), maybe I might learn something about
> > te VM system.
>
> http://www.tux.org/lkml/#s14-3
>
> > Please read the FAQ at http://www.tux.org/lkml/
>
>
> regards
> john

Thank you, I missed that; it's been a while since I read the FAQ.

Thanks
Bongani


2002-01-16 21:10:39

by Dave Jones

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Richard Gooch wrote:

> > [..] La nuova VM ha migliori performance rispetto alla vecchia sui
> > tipici sistemi desktop ... ma fa fiasco terribilmente su più sistemi di
> > quanto non lo facesse la vecchia VM. Redhat, per esempio, non ha potuto
> > inserire la nuova VM nella sua distribuzione perché cadrebbe a pezzi per
> > i database server, [..]

> I don't know why you're so upset. As far as I can tell, Rik has warmly
> praised your VM in the above message. Of course, since you didn't
> provide an English translation, I can't really be sure.

English (Well, some crazy moon-language dialect of english) translation:-
http://linux.html.it/articoli/rik_van_riel_en1.htm

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-01-16 21:17:39

by Craig Knox

Subject: Re: Rik spreading bullshit about VM

English translation:
http://linux.html.it/articoli/rik_van_riel_en1.htm

> I don't know why you're so upset. As far as I can tell, Rik has warmly
> praised your VM in the above message. Of course, since you didn't
> provide an English translation, I can't really be sure. Perhaps Rik
> was talking about Virtual Machines, and not Virtual Memory? Or perhaps
> Virgin Mary?
>
> Regards,
>
> Richard....
> Permanent: [email protected]
> Current: [email protected]


2002-01-16 21:20:21

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Dave Jones wrote:
> On Wed, 16 Jan 2002, Richard Gooch wrote:
>
> > > [..] La nuova VM ha migliori performance rispetto alla vecchia sui
> > > tipici sistemi desktop ... ma fa fiasco terribilmente su più sistemi di
> > > quanto non lo facesse la vecchia VM. Redhat, per esempio, non ha potuto
> > > inserire la nuova VM nella sua distribuzione perché cadrebbe a pezzi per
> > > i database server, [..]
>
> > I don't know why you're so upset. As far as I can tell, Rik has warmly
> > praised your VM in the above message. Of course, since you didn't
> > provide an English translation, I can't really be sure.
>
> English (Well, some crazy moon-language dialect of english) translation:-
> http://linux.html.it/articoli/rik_van_riel_en1.htm

It seems the IRC log of the journalist in question was missing
some lines of what I said and he just glued together the remaining
parts of the paragraph for that particular question ;)

The rest of the interview seems to have survived pretty ok, though.

cheers,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-16 21:30:31

by Adam Kropelin

Subject: Re: Rik spreading bullshit about VM

Andrea Arcangeli wrote:
<snip>
>I don't have a single bugreport about the current 2.4.18pre2aa2 VM (except
>perhaps the bdflush wakeup that seems to be a little too late and that deals to
>lower numbers with slow write load etc.., fixable with bdflush tuning).

I don't know if this is a reference to the issue I reported under the "Writeout in
recent kernels..." thread or not. If not, my apologies for clogging up this new
"discussion".

As reported[0] in the above-mentioned thread, the bdflush tuning parameters
you suggested made no difference in my test case other than slightly adjusting
the temporal relationship between writeout and file transfer. -aa still performs
slightly worse than both 2.4.17 stock and -rmap. 2.4.13-ac7 currently beats
all competitors.

--Adam

[0] http://www.kroptech.com:8300/mailimport/showmsg.php?msg_id=49746&db_name=linux_kernel

2002-01-16 21:55:09

by grundig

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Andrea Arcangeli wrote:
> attached) and most important I don't have a single bugreport about the
> current 2.4.18pre2aa2 VM (except perhaps the bdflush wakeup that seems
> to be a little too late and that deals to lower numbers with slow write
> load etc.., fixable with bdflush tuning). Mainline VM kills too easily,

Well, I haven't reported it yet, but booting my box with mem=4M
gave as result: (running 2.4.18-pre2aa2):
diego# cat /var/log/messages | grep gfp
Jan 13 15:37:10 localhost kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 15 16:06:28 localhost kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 15 18:37:21 localhost kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 15 21:58:32 localhost kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
Jan 15 21:58:33 localhost kernel: __alloc_pages: 0-order allocation failed (gfp=0xf0/0)
diego#

Each script in /etc/rc.d was killed by the VM as soon as it started.
There was no "OOM" message, just "VM killed..." or something similar.
Because the /etc/rc.d scripts were killed, I couldn't start swap.

The gfp=0x... numbers were not always the same, but I can't remember them
because syslogd wasn't running.
I can repeat this if you want and I'll copy all messages.
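
To make a list of such messages easier to compare, here is a rough sketch that pulls the gfp masks out of a syslog excerpt and names the bits that are set. The flag table is an assumption: it is written from memory of the 2.4-era include/linux/mm.h and should be checked against the tree actually being run. The names GFP_FLAGS, gfp_masks and decode_gfp are illustrative, and the value after the slash in "(gfp=0xf0/0)" is ignored.

```python
import re

# ASSUMPTION: bit values as recalled from 2.4-era include/linux/mm.h.
# Verify against the kernel source in use before relying on the names.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_HIGHIO",
    0x100: "__GFP_FS",
}

# Works whether or not the mailer wrapped "(gfp=...)" onto its own line.
_GFP = re.compile(r"\(gfp=0x([0-9a-fA-F]+)/")


def gfp_masks(log_text):
    """Return every gfp mask found in the log text, in order, as integers."""
    return [int(hex_digits, 16) for hex_digits in _GFP.findall(log_text)]


def decode_gfp(mask):
    """Return the names of the known flag bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]


LOG = """\
Jan 13 15:37:10 localhost kernel: __alloc_pages: 0-order allocation failed
(gfp=0xf0/0)
Jan 15 16:06:28 localhost kernel: __alloc_pages: 0-order allocation failed
(gfp=0xf0/0)
"""

if __name__ == "__main__":
    for mask in sorted(set(gfp_masks(LOG))):
        print(hex(mask), "=", " | ".join(decode_gfp(mask)))
```

Under that assumed table, 0xf0 has the wait, high-priority, I/O and high-I/O bits set and the filesystem bit clear.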

..I remember running 2.2.14 on a 386 box with 4MB of RAM and 8 or 16 MB of
swap. It was veeery slow, but I could even run apache :-)...



> this is fixed in -aa VM and -aa VM has a number of other issues
> resolved, but mainline 2.4 vm isn't that far either. In the last few
> days I was playing with pte-highmem, soon I will spend some time merging
> -aa VM into mainline with Marcelo if he likes to.
>
> Andrea
>
> PS. I know the interviewer and he's usually very accurate, so I don't
> think this could be a misunderstanding where you say one thing and they
> writer another one just to create troubles.
>
>

2002-01-16 21:50:19

by Dieter Nützel

Subject: Re: Rik spreading bullshit about VM

Adam Kropelin wrote:
> Andrea Arcangeli wrote:
> <snip>
> >I don't have a single bugreport about the current 2.4.18pre2aa2 VM (except
> >perhaps the bdflush wakeup that seems to be a little too late and that
> >deals to lower numbers with slow write load etc.., fixable with bdflush
> >tuning).
>
> I don't know if this is a reference to the issue I reported under the
> "Writeout in recent kernels..." thread or not. If not, my apologies for
> clogging up this new "discussion".
>
> As reported[0] in the above-mentioned thread, the bdflush tuning parameters
> you suggested made no difference in my test case other than slightly
> adjusting the temporal relationship between writeout and file transfer. -aa
> still performs slightly worse than both 2.4.17 stock and -rmap. 2.4.13-ac7
> currently beats all competitors.

Put Andrew's read-latency.patch on -aa (10_vm-22) and see what you get out of
it. It should fly...

-Dieter

--
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: [email protected]

2002-01-16 22:03:19

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Diego Calleja wrote:
> On Wed, 16 Jan 2002, Andrea Arcangeli wrote:
> > attached) and most important I don't have a single bugreport about the
> > current 2.4.18pre2aa2 VM (except perhaps the bdflush wakeup that seems
>
> Well, I haven't reported it yet, but booting my box with mem=4M
> gave as result: (running 2.4.18-pre2aa2):
> diego# cat /var/log/messages | grep gfp
> Jan 13 15:37:10 localhost kernel: __alloc_pages: 0-order allocation failed
> (gfp=0xf0/0)

> Each script of /etc/rc.d was killed by VM when it was started,

It seems Andrea's patch backs out a bugfix for this problem
which Marcelo and I put into the normal 2.4 kernel ...

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-16 22:45:03

by Chris Chabot

Subject: Re: Rik spreading bullshit about VM


> Test hardware:
> 4 way Dell, 4 GB physical RAM, SCSI/RAID subsystem,
> DB runs on FS.

Can we first make sure that other factors don't play a role in this
benchmark? I have a number of (14+) Dell servers here, and I know for a
fact that most of their RAID systems are heavily borked in the
performance department.

All kernels up to 2.4.1x performed horribly, and all kernels after 2.4.16
or so perform horribly again! Somewhere in between, some magic seemed to
happen in the block layer / elevator code / etc. that caused performance
to increase by up to 100% on the Dell PERC adapters (starting at the first
release of the AA VM). However, after a few small releases,
performance went back down to the same old horrible level.

So it might well be (very likely, actually) that the tested Red Hat 2.4.14
is a performance 'sweet spot' kernel, where kernels < 2.4.13 and >
2.4.15 or 16 definitely are not.

The RAID performance is a whole issue in itself. Part seems to be
block I/O / elevator / driver related, and part seems to be adapter
firmware related. (And Adaptec refusing to release their drivers.)

However, since both 2.4.17 and 2.4.7 have the same horrible RAID
performance, I do not think the VM is responsible for that part ;-)

A good test would be to configure those disks on a normal AIC7xxx
adapter and software-RAID them together. The performance of that is
'equal' between those different kernels, and much, much higher than the
hardware RAID. Benchmarking with this would give much better results for
benchmarking VMs.

-- Chris




2002-01-17 00:08:33

by Erik Mouw

Subject: Re: Rik spreading bullshit about VM

On Wed, Jan 16, 2002 at 08:04:59PM +0100, Andrea Arcangeli wrote:
> I read here:
>
> http://linux.html.it/articoli/rik_van_riel_ita1.htm

[...]

> This is total bullshit. If there's something where the -aa VM is good
> are the DBMS, that was designed for it basically, very lightweight,
> basically no VM overhead also under very heavy I/O.

Sorry, but in my opinion Rik's rmap VM still beats your VM under IO
load. My benchmark is very simple: import a kernel tree into a CVS tree
that already contains about 470 other kernel trees. Both the import
directory and the CVS root are on the same disk. With 2.4.17 the mp3
player stutters, I can't even read email or edit a couple of files with
XEmacs at the same time. With 2.4.17-rmap-11a the mp3 player runs
smoothly and email and XEmacs are usable again.

Some time ago Linus made the important observation that we shouldn't
tune the scheduler for SMP systems, simply because 99.9% of the systems
in the world running Linux have a single CPU. IMHO an equally valid
observation would be that we shouldn't tune the VM for the 0.1% of the
systems in this world that run large DBMSes. The 99.9% majority is much
more important.

Just my 0.02.


Erik

--
J.A.K. (Erik) Mouw, Information and Communication Theory Group, Faculty
of Information Technology and Systems, Delft University of Technology,
PO BOX 5031, 2600 GA Delft, The Netherlands Phone: +31-15-2783635
Fax: +31-15-2781843 Email: [email protected]
WWW: http://www-ict.its.tudelft.nl/~erik/

2002-01-17 00:20:53

by Luigi Genoni

Subject: Re: Rik spreading bullshit about VM



On Wed, 16 Jan 2002, Dave Jones wrote:

> On Wed, 16 Jan 2002, Richard Gooch wrote:
>
> > > [..] La nuova VM ha migliori performance rispetto alla vecchia sui
> > > tipici sistemi desktop ... ma fa fiasco terribilmente su più sistemi di
> > > quanto non lo facesse la vecchia VM. Redhat, per esempio, non ha potuto
> > > inserire la nuova VM nella sua distribuzione perché cadrebbe a pezzi per
> > > i database server, [..]
>
> > I don't know why you're so upset. As far as I can tell, Rik has warmly
> > praised your VM in the above message. Of course, since you didn't
> > provide an English translation, I can't really be sure.
>
> English (Well, some crazy moon-language dialect of english) translation:-
> http://linux.html.it/articoli/rik_van_riel_en1.htm
err, the english version is the original one...

>
> --
> | Dave Jones. http://www.codemonkey.org.uk
> | SuSE Labs
>

2002-01-17 00:21:35

by Luigi Genoni

Subject: Re: Rik spreading bullshit about VM


Here is the english version:

The new VM has better performance than the old VM for typical desktop systems ... but it fails horribly on more systems than the old VM. Redhat, for example, cannot ship the new VM in their distribution because it'll just fall apart for the database servers, some of their users run at least now my code is gone I no longer have to work together with Linus, which is a good thing ;)



As a little comment, I would rather not talk about the bad taste
and the poor style of saying those things in this way.
If you just read the lkml archive you can find tens of mails
from people who were just waiting for the AA VM to solve the problems they
had with their heavily stressed DBs (I am one of them). That is true for
small DBs and for huge DBs using some GB of RAM.
So basically Rik's assertion is far from the truth in many respects.

Luigi

2002-01-17 00:26:13

by jjs

Subject: Re: Rik spreading bullshit about VM

Erik Mouw wrote:

>Sorry, but in my opinion Rik's rmap VM still beats your VM under IO
>load. My benchmark is very simple: import a kernel tree into a CVS tree
>that already contains about 470 other kernel trees. Both the import
>directory and the CVS root are on the same disk. With 2.4.17 the mp3
>player stutters, I can't even read email or edit a couple of files with
>XEmacs at the same time. With 2.4.17-rmap-11a the mp3 player runs
>smoothly and email and XEmacs are usable again.
>
Nice try, but do the test again with 2.4.18-pre2-aa2 -

2.4.17 doesn't necessarily have all of Andrea's fixes.

Just my .02

jjs

2002-01-17 00:32:13

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Thu, 17 Jan 2002, V-man wrote:

> So basically Ri's assertion is far from truth on many aspects.

That assumes it is my assertion, it appears the journalist in
question is missing a few lines from his IRC log though...

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 00:39:23

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Andrea Arcangeli wrote:

> I read here:
>
> http://linux.html.it/articoli/rik_van_riel_ita1.htm
>
> This is total bullshit.

> PS. I know the interviewer and he's usually very accurate, so I don't
> think this could be a misunderstanding where you say one thing and
> they writer another one just to create troubles.

1) the journalist may be good, but his English was far from
   fluent; there was some confusion at times during the
   interview

2) the interview was done quite informally on IRC, with me
   replying in normal IRC style ... it appears however
   that the sentence fragments were just cut'n'pasted
   together into something grammatically dubious, this has
   messed up the contents in some places ;)

3) I guess this whole stuff was converted to grammatically
   correct Italian, possibly meaning something slightly
   different from the text in (2), definitely something
   else than what I wanted to say.

I guess this is the last time I'm giving an interview to a
journalist who isn't fluent in any of the languages I'm
fluent in ;)

cheers,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 01:17:06

by Andrea Scrimieri

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Rik van Riel wrote:

> It seems the IRC log of the journalist in question was missing
> some lines of what I said and he just glued together the remaining
> parts of the paragraph for that particular question ;)
>
> The rest of the interview seems to have survived pretty ok, though.


I'm sorry but you are wrong. The interview was published untouched.
Nothing was cut or moved. These are my IRC logs about the paragraph you're
talking about.


---START---
[msg(riel)] With kernel 2.4.10 we have seen that Linus Torvalds has
preferred Arcangeli's VM to yours. What do you think of his decision? And
why has he made that?


[riel([email protected])] it was a strange
situation, first Linus ignores bugfixes by me and Alan for almost a year,
then he complains we "didn't send" him the bugfixes and he replaces the VM
of course, the new VM has better performance than the old VM for typical
desktop systems ... but it fails horribly on more systems than the old VM
Redhat, for example, cannot ship the new VM in their distribution because
it'll just fall apart for the database servers some of their users run at
least now my code is gone I no longer have to work together with Linus,
which is a good thing ;)

[msg(riel)] Why is it a good thing?

[riel([email protected])] with Linus out of the way,
I can make a good VM. I no longer have to worry about what Linus likes or
doesn't like. This is mostly important for intermediary code, where some of
the "ingredients" to a VM are in place and others aren't yet in place such
code can look ugly or pointless if you don't have the time to look at the
design for a few days, so Linus tends to remove it ... even though it is
needed to continue with development

---END---

The original IRC interview was made in english: as we didn't want to
change anything said by Rik, we didn't correct even grammatical or
syntactical errors. I even left emoticons as they were typed...


Best regards,
Andrea Scrimieri

2002-01-17 01:17:06

by Erik Mouw

Subject: Re: Rik spreading bullshit about VM

On Wed, Jan 16, 2002 at 04:25:45PM -0800, J Sloan wrote:
> Nice try, but do the test again with 2.4.18-pre2-aa2 -
>
> 2.4.17 doesn't neccesarily have all andrea's fixes.

Nice try, but 2.4.17-rmap-11a doesn't have all Rik's fixes either
(rmap-11b is available, but I don't feel like running a new kernel
every few days).

I've been running a couple of 2.4.17-pre kernels on my laptop (which is
my primary machine), but each time they made me switch back to good old
2.4.13-ac5 simply because its VM (read: Rik's VM) was much smoother.

It's not that I think Andrea's VM is bad, it's just that a VM should be
tuned for the common cases, not for the power users that want to
squeeze every last drop out of it. It's fine with me if somebody wants
to design a VM for the niche XYZ, but do that as a separate patch and
don't clutter up the mainline kernel with it.


Erik
[sleep(7*3600);]

--
J.A.K. (Erik) Mouw, Information and Communication Theory Group, Faculty
of Information Technology and Systems, Delft University of Technology,
PO BOX 5031, 2600 GA Delft, The Netherlands Phone: +31-15-2783635
Fax: +31-15-2781843 Email: [email protected]
WWW: http://www-ict.its.tudelft.nl/~erik/

2002-01-17 01:23:16

by Randy Hron

Subject: Re: Rik spreading bullshit about VM

About running the Andrea VM in 4M:
I booted in single user mode.

bash-2.05a# uname -a
Linux mountain 2.4.18pre2aa2 #1 Wed Jan 9 21:44:03 EST 2002 i586 unknown

bash-2.05a# dmesg| grep mem
Kernel command line: BOOT_IMAGE=2418p2aa2 ro root=1602 console=ttyS1,38400n8 single mem=4m
Memory: 2100k/4096k available (891k kernel code, 1608k reserved, 215k data, 196k init, 0k highmem)
Freeing unused kernel memory: 196k freed

Manually started syslogd, klogd and brought up the lo and eth0 interfaces and started sshd.
I can ssh into the box:

mountain:/$ ps aux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.4  0.0  1288    0 ?        SW   19:24   0:06 init [S]
root         2  0.0  0.0     0    0 ?        SW   19:24   0:00 [keventd]
root         3  0.0  0.0     0    0 ?        SWN  19:24   0:00 [ksoftirqd_CPU0]
root         4  0.1  0.0     0    0 ?        SW   19:24   0:02 [kswapd]
root         5  0.0  0.0     0    0 ?        SW   19:24   0:00 [bdflush]
root         6  0.0  0.0     0    0 ?        SW   19:24   0:00 [kupdated]
root         7  0.0  0.0     0    0 ?        SW   19:24   0:00 [kreiserfsd]
root        21  0.0  0.0  1288    0 ttyS1    SW   19:25   0:00 init [S]
root        22  0.0  0.0  2212    0 ttyS1    SW   19:25   0:00 bash
root        46  0.0  0.0  1444    0 ?        SW   19:37   0:00 /usr/sbin/syslogd -m0
root        49  0.0  0.0  1292    0 ?        SW   19:38   0:00 /usr/sbin/klogd -c3 -x -k /boot/System.map-2.4.18pre2aa2
root        63  0.0  0.0     0    0 ?        SW   19:47   0:00 [eth0]
root        68  0.5  0.0  2672    0 ?        SW   19:47   0:00 sshd
root        69  0.9  0.8  2756   20 ?        D    19:48   0:00 sshd
hrandoz     70  1.9  0.0  2192    0 pts/0    SW   19:48   0:01 -bash
hrandoz     75 40.0  4.1  2504   96 pts/0    R    19:49   0:00 ps aux

mountain:/$ cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 2351104 2232320 118784 0 344064 249856
Swap: 156270592 1409024 154861568
MemTotal: 2296 kB
MemFree: 116 kB
MemShared: 0 kB
Buffers: 336 kB
Cached: 204 kB
SwapCached: 40 kB
Active: 68 kB
Inactive: 516 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 2296 kB
LowFree: 116 kB
SwapTotal: 152608 kB
SwapFree: 151232 kB

hrandoz@mountain:/$ /sbin/swapon -s
Filename Type Size Used Priority
/dev/hda3 partition 152608 1372 -1


top:
7:55pm up 31 min, 1 user, load average: 1.30, 0.71, 0.32
16 processes: 15 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 0.1% user, 4.6% system, 0.0% nice, 95.1% idle
Mem: 2296K av, 2096K used, 200K free, 0K shrd, 316K buff
Swap: 152608K av, 1488K used, 151120K free 96K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
79 hrandoz 14 0 316 200 184 R 2.0 8.7 0:02 top
69 root 18 0 352 0 0 SW 1.9 0.0 0:03 sshd
1 root 20 0 56 0 0 SW 0.0 0.0 0:06 init
2 root 20 0 0 0 0 SW 0.0 0.0 0:00 keventd
3 root 20 19 0 0 0 SWN 0.0 0.0 0:00 ksoftirqd_CPU0
4 root 20 0 0 0 0 SW 0.0 0.0 0:03 kswapd
5 root 3 0 0 0 0 SW 0.0 0.0 0:00 bdflush
6 root 20 0 0 0 0 SW 0.0 0.0 0:00 kupdated
7 root 20 0 0 0 0 SW 0.0 0.0 0:00 kreiserfsd
21 root 9 0 56 0 0 SW 0.0 0.0 0:00 init
22 root 20 0 296 0 0 SW 0.0 0.0 0:00 bash
46 root 20 0 108 0 0 SW 0.0 0.0 0:00 syslogd
49 root 20 0 72 0 0 SW 0.0 0.0 0:00 klogd
63 root 20 0 0 0 0 SW 0.0 0.0 0:00 eth0
68 root 20 0 276 0 0 SW 0.0 0.0 0:00 sshd
70 hrandoz 20 0 276 0 0 SW 0.0 0.0 0:01 bash

bash-2.05a# df -kT
Filesystem Type 1k-blocks Used Available Use% Mounted on
/dev/hdc2 reiserfs 9768728 4194328 5574400 43% /
/dev/hdc3 ext2 8064432 4912 7649872 1% /opt/testing


Running 4M is interesting. If I really wanted to run 4M, I'd use ash
for the shell, and try uClibc instead of glibc.

The test that really impresses me the most about 2.4.18pre2aa2 is what
I see on this test on an Athlon 1333 with 1024MB RAM:

simultaneously:
run continuous loop of mtest01 -p 85 -w # allocate and write to 85% of VM
create and cpio 10 330 MB files
nice -19 setiathome &
listen to mp3blaster # a few skips at the beginning, then smooth.

I've tested a bunch of kernels lately. This is only what's sitting in /boot
since I last cleaned it out:

mountain:/boot$ ls vml*
vmlinuz vmlinuz-2.4.17rc2aa2-old vmlinuz-2.4.18-pre3
vmlinuz-2.4.18pre2 vmlinuz-2.4.18pre3ll vmlinuz-2.5.1-dj11
vmlinuz-2.5.2-pre10 vmlinuz-2.5.2-pre9 vmlinuz-2.4.17
vmlinuz-2.4.17rc2aa2-wli vmlinuz-2.4.18-pre4 vmlinuz-2.4.18pre2aa1
vmlinuz-2.4.18pre3pe vmlinuz-2.5.1-dj13 vmlinuz-2.5.2-pre11
vmlinuz-2.5.2-pre9mingo vmlinuz-2.4.17-rmap11a vmlinuz-2.4.17rmap11b
vmlinuz-2.4.18pre1-mjc2nio vmlinuz-2.4.18pre2aa2 vmlinuz-2.4.18pre3pelb
vmlinuz-2.5.1-dj14 vmlinuz-2.5.2-pre5 vmlinuz.old
vmlinuz-2.4.17rc2aa2 vmlinuz-2.4.18-pre1 vmlinuz-2.4.18pre1mjc2
vmlinuz-2.4.18pre3-ac2 vmlinuz-2.5.1 vmlinuz-2.5.2
vmlinuz-2.5.2-pre6-mingo

IMHO, 2.4.18pre2aa2 is the best!

Some kernel test results at:
http://home.earthlink.net/~rwhron/kernel/k6-2-475.html

# from the 4M box
mountain:/boot$ uptime
8:24pm up 1:00, 1 user, load average: 0.08, 0.04, 0.09

--
Randy Hron

2002-01-17 01:33:06

by Kallol Biswas

Subject: C source lines for assembly listing

Hi,
Does gcc have an option to list the C source line information for
assembly instructions?

Kallol

2002-01-17 01:51:59

by Nicolas Pitre

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Rik van Riel wrote:

> 1) the journalist may be good, but his english was far from
> fluent, there was some confusion at times during the
> interview
>
> 2) the interview was done quite informally on IRC, with me
> replying in normal IRC style ... it appears however
> that the sentence fragments were just cut'n'pasted
> together into something grammatically dubious, this has
> messed up the contents in some places ;)
>
> 3) I guess this whole stuff was converted to grammatically
> correct Italian, possibly meaning something slightly
> different from the text in (2), definitely something
> else than what I wanted to say.

Then...

What did you want to say exactly?

Why aren't you rushing out to provide corrections?

What are you waiting for?


Nicolas

2002-01-17 01:53:29

by Stephen Satchell

Subject: Re: Rik spreading bullshit about VM

At 10:38 PM 1/16/02 -0200, Rik van Riel wrote:
>I guess this is the last time I'm giving an interview to a
>journalist who isn't fluent in any of the languages I'm
>fluent in ;)

As a journalist I would also say that you shouldn't use IRC for the
interview medium. I once tried to do an interview using the "talk"
facility on CompuServe and capture the conversation to floppy disk. (Easy
for me to do with Professional YAM.) I ended up repeating the interview in
electronic mail because when I fact-checked the interview it turned out
that the guy didn't stick to the same story throughout the interview. The
e-mail interview was much more coherent.

Some people just don't think well on their feet. Some reporters try to
take advantage of it.

Satch

2002-01-17 02:15:10

by Andrea Scrimieri

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Rik van Riel wrote:

> 1) the journalist may be good, but his english was far from
> fluent, there was some confusion at times during the
> interview

This email is just to let you all know that the original interview was the
English one. I published it as it was in my IRC log, without correcting
any grammatical or syntactical errors, removing emoticons, or changing any word.



> 2) the interview was done quite informally on IRC, with me
> replying in normal IRC style ... it appears however
> that the sentence fragments were just cut'n'pasted
> together into something grammatically dubious, this has
> messed up the contents in some places ;)

Rik, I asked you to choose between an email or IRC interview, you chose
IRC, so from that moment if you used informal language, it was outside
my job. If you aren't able to take responsibility for your
actions or words, this probably means you're not mature enough to be a
maintainer. If you are sure you are a victim, publish your logs, I'll be
happy to publish mine.



> 3) I guess this whole stuff was converted to grammatically
> correct Italian, possibly meaning something slightly
> different from the text in (2), definitely something
> else than what I wanted to say.

The interview was translated to Italian by a highly qualified person.
Anyway, both the Italian and English interviews are on the web, I'm sure
Andrea, or whoever, will be happy to tell you if anything was changed.


Best regards,
Andrea Scrimieri

2002-01-17 03:58:40

by Randy Hron

Subject: Re: Rik spreading bullshit about VM


A couple points to followup on the notion of VM at mem=4m:

2.4.18pre2aa2 can do it:
http://marc.theaimsgroup.com/?l=linux-kernel&m=101123070310781&w=2

2.5.2, 2.4.18-pre4, and 2.4.18-pre3-rmap11b would not allow login
with boot single mem=4m:

2.4.18-pre3-rmap11b tried with init=/bin/bash and init=/bin/ash,
but that would not produce a prompt either.

Log at:
http://home.earthlink.net/~rwhron/kernel/4m

BTW, I think 4G is more important than 4M.

--
Randy Hron

2002-01-17 07:45:13

by Luigi Genoni

Subject: Re: Rik spreading bullshit about VM

Excuse me, I write as a journalist for a magazine too,
and so I have a little experience with these things.
Before the interview was published, did you read it to give final
permission to publish it?

It is quite logical that some pieces of a speech
are cut out of an interview if they are considered
uninteresting (there are also space limits, you know ;) ), and
that is why the person who has been interviewed should
do a final check to be sure that what is written
expresses exactly his thoughts.

Luigi


On Wed, 16 Jan 2002, Rik van Riel wrote:

> Date: Wed, 16 Jan 2002 22:31:26 -0200 (BRST)
> From: Rik van Riel <[email protected]>
> To: V-man <[email protected]>
> Cc: [email protected], [email protected]
> Subject: Re: Rik spreading bullshit about VM
>
> On Thu, 17 Jan 2002, V-man wrote:
>
> > So basically Rik's assertion is far from truth on many aspects.
>
> That assumes it is my assertion, it appears the journalist in
> question is missing a few lines from his IRC log though...
>
> Rik
> --
> "Linux holds advantages over the single-vendor commercial OS"
> -- Microsoft's "Competing with Linux" document
>
> http://www.surriel.com/ http://distro.conectiva.com/
>

2002-01-17 08:19:25

by Christoph Rohland

Subject: Re: Rik spreading bullshit about VM

Hi Chris,

On Wed, 16 Jan 2002, Chris Chabot wrote:
>
>> Test hardware:
>> 4 way Dell, 4 GB physical RAM, SCSI/RAID subsystem,
>> DB runs on FS.
>
> Can we first make sure that the other factors don't play a role in
> this benchmark? I have a couple (14+) Dell servers here, and I know
> for a fact that most of their RAID systems are heavily borked in the
> performance department.

We run these tests regularly on different hardware - We have hardware
from Compaq, Dell, HP, IBM and FSC.

All tests so far showed the same overall performance tendencies.

Greetings
Christoph


2002-01-17 11:11:56

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Thu, 17 Jan 2002, Luigi Genoni wrote:

> Excuse me, I write as a journalist for a magazine too,
> and so I have a little experience with these things.
> Before the interview was published, did you read it for final
> permission to publish it?

No, I wasn't offered a chance to see the interview before it was published.

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 11:46:10

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Nicolas Pitre wrote:

> Then...
>
> What did you want to say exactly?
>
> Why aren't you rushing out to provide corrections?
>
> What are you waiting for?

It's just an article, it's just this week.

Next week people will be all upset about the next
block io layer flamewar or maybe the politicians
will have time again for the next round of DMCA.

I'm not a good writer, by the time I'd have written
any correction or follow-up people would have mostly
forgotten this thing anyway.

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 12:02:53

by Stephan von Krawczynski

Subject: Re: Rik spreading bullshit about VM

On Thu, 17 Jan 2002 09:45:24 -0200 (BRST)
Rik van Riel <[email protected]> wrote:

> On Wed, 16 Jan 2002, Nicolas Pitre wrote:
>
> > Then...
> >
> > What did you want to say exactly?
> >
> > Why aren't you rushing out to provide corrections?
> >
> > What are you waiting for?
>
> It's just an article, it's just this week.
>
> Next week people will be all upset about the next
> block io layer flamewar or maybe the politicians
> will have time again for the next round of DMCA.
>
> I'm not a good writer, by the time I'd have written
> any correction or follow-up people would have mostly
> forgotten this thing anyway.

Wrong century, Rik.
Google is your friend.

But anyway, hope we all have learned something.

Regards,
Stephan


2002-01-17 12:07:34

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Thu, 17 Jan 2002, Andrea Scrimieri wrote:

> > 2) the interview was done quite informally on IRC, with me
> > replying in normal IRC style ... it appears however
> > that the sentence fragments were just cut'n'pasted
> > together into something grammatically dubious, this has
> > messed up the contents in some places ;)
>
> Rik, I asked you to choose between an email or IRC interview, you chose
> IRC, so from that moment if you used informal language, it was outside
> my job. If you aren't able to take responsibility for your
> actions or words,

I'm willing to take full responsibility for what I mean,
not for other people's interpretations of my words.

It seems IRC isn't a good medium to do interviews so I
won't do that again.

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 13:11:26

by Andrea Scrimieri

Subject: Re: Rik spreading bullshit about VM

On Thu, 17 Jan 2002, Rik van Riel wrote:

> On Thu, 17 Jan 2002, Andrea Scrimieri wrote:
>
> > > 2) the interview was done quite informally on IRC, with me
> > > replying in normal IRC style ... it appears however
> > > that the sentence fragments were just cut'n'pasted
> > > together into something grammatically dubious, this has
> > > messed up the contents in some places ;)
> >
> > Rik, I asked you to choose between an email or IRC interview, you chose
> > IRC, so from that moment if you used informal language, it was outside
> > my job. If you aren't able to take responsibility for your
> > actions or words,
>
> I'm willing to take full responsibility for what I mean,
> not for other people's interpretations of my words.

For the last time, I didn't change the interview text or cut anything. If
you want to keep throwing shit at me because you won't take responsibility
for your words, please publish the logs. I'm getting tired of
sentences like: "That assumes it is my assertion, it appears the
journalist in question is missing a few lines from his IRC log though...".
That's not true.

If you saw some differences between my version and yours, why didn't you
tell me? Maybe because in reality there were none? This controversy
started almost 2 days after the article was published. If it contained lies
or inaccuracies, you had all the time you needed to
tell me. You started claiming that there were cuts only after Andrea's
first email. Please be fair and take responsibility.


>
> It seems IRC isn't a good medium to do interviews so I
> won't do that again.

You chose IRC, I didn't.

Regards,

Andrea

2002-01-17 13:14:42

by Alan

Subject: Re: Rik spreading bullshit about VM

> If redhat doesn't use the -aa VM into their kernels that's either a
> political decision or they're not good enough at the VM. I can tell you

If you want to insult the Red Hat people please don't do it from a SuSE
address. There are some great people at SuSE and I somehow doubt you speak
for the management or major stockholders (ibm etc)

When Red Hat shipped 7.2 the -aa VM didn't even exist. It was 2.4.7 era -
so the choices were Rik's stuff half missing and mangled by Linus versus
Rik's stuff proper (2.4.7-ac). The latter passed QA; the former choked and died.
The same basically applied for the 2.4.9 based errata - combined with
a desire to reduce unneeded change, because paying corporate customers want
certainty, not neat toys. Shipping 2.4.10 to customers would have been
pretty irresponsible when it seems that in some cases even things like
fsync() didn't actually work. The O_DIRECT security bug with /dev alone I
think justified that caution.

When we tested 2.4.17 during evaluation we found it 20% slower on many I/O
heavy workloads. I don't know if -aa ever got that far in QA testing.
I do know 2.4.17+rmap11* passes Cerberus.

From my own testing 2.4.18pre3-aa does pretty well; it's better than the
2.4.17 base until you get high loads, then it gets ugly. Clearly it has a
lot of things right.

At the moment both 2.4.18pre+rmap and -aa are better than base 2.4.18pre3,
so it's in your interest to actually send Marcelo the changes one at a time
with explanations to get 2.4.18 better and better, and with luck find which
change is causing the horribly heavy swap behaviour and "0 order allocation
failed" cases along the way.

Alan

2002-01-17 13:17:32

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Thu, 17 Jan 2002, Andrea Scrimieri wrote:

> > I'm willing to take full responsibility for what I mean,
> > not for other people's interpretations of my words.
>
> For the last time, I didn't change the interview text or cut anything.
> If you want to keep throwing shit at me because you won't take

Which part of "to interpret" do you not understand?

Interpreting my text is reading it and building a
meaning for the words in your head. It doesn't mean
the text needs to be changed ... the meaning can be
different for each person reading the exact same
words.

I guess this is why politicians are very careful to
not attach any meaning at all to their words, however
I'm a programmer, not a politician ;)

> > It seems IRC isn't a good medium to do interviews so I
> > won't do that again.
>
> You chose IRC, I didn't.

And that was my mistake, indeed.

I think you're reading things into my words that I
didn't want to put there.

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 13:42:48

by Ian Soboroff

Subject: Re: [lkml] Re: Rik spreading bullshit about VM

Rik van Riel <[email protected]> writes:

> It seems the IRC log of the journalist in question was missing
> some lines of what I said and he just glued together the remaining
> parts of the paragraph for that particular question ;)

It also looked to me like Rik was talking about the mainline VM, not
any VM that exists in the many not-yet-integrated -aa patchsets.

The argument always feels to me like:

Joe Luser: "I downloaded a 2.4 kernel and boy does the VM bite!"

AA: "That's not true at all, but you need to apply these thirteen -aa
patches from my site."

Joe Luser: <runs away in fear>

ian

2002-01-17 14:14:11

by Andrea Arcangeli

Subject: blkdev speedup

On Wed, Jan 16, 2002 at 11:44:37PM +0100, Chris Chabot wrote:
>
> > Test hardware:
> > 4 way Dell, 4 GB physical RAM, SCSI/RAID subsystem,
> > DB runs on FS.
>
> Can we first make sure that the other factors don't play a role in this
> benchmark? I have a couple (14+) Dell servers here, and I know for a
> fact that most of their RAID systems are heavily borked in the
> performance department.
>
> All kernels up to 2.4.1x performed horribly, and all kernels after 2.4.16
> or so perform horribly again! Somewhere in between some magic seemed to
> happen in the block layer / elevator code / etc, that caused performance
> to increase up to 100% on the Dell PERC adapters. (started @ the first
> release of the AA VM). However after a few small releases, the

If you're using the blkdev directly, then please try mounting the blkdev
with a 4k filesystem before running your benchmark; that should give you
the magic performance back. 2.4.10 intentionally defaulted to 4k
I/O, which is probably what made the difference for you.

Andrea

2002-01-17 14:17:31

by Andrea Arcangeli

Subject: async buffer flushing reported slowdown (could be a driver issue?)

On Wed, Jan 16, 2002 at 04:29:54PM -0500, Adam Kropelin wrote:
> Andrea Arcangeli wrote:
> <snip>
> >I don't have a single bugreport about the current 2.4.18pre2aa2 VM (except
> >perhaps the bdflush wakeup that seems to be a little too late and that leads to
> >lower numbers with slow write load etc.., fixable with bdflush tuning).
>
> I don't know if this is a reference to the issue I reported under the "Writeout in
> recent kernels..." thread or not. If not, my apologies for clogging up this new
> "discussion".

Yes, I was thinking about your report.

>
> As reported[0] in the above-mentioned thread, the bdflush tuning parameters
> you suggested made no difference in my test case other than slightly adjusting
> the temporal relationship between writeout and file transfer. -aa still performs
> slightly worse than both 2.4.17 stock and -rmap. 2.4.13-ac7 currently beats
> all competitors.

Then can you verify that the bandwidth you get out of the network card is the
same across 2.4.13-ac7 and all the other kernels you are trying? Also
please check with hdparm -t that the speed you get out of IDE is the same.
This sounds like some driver changed (note that -ac is used to queue
lots of driver updates) and that made the difference. Otherwise, if we
wake up bdflush early enough, I don't see why it takes more time.

Andrea

2002-01-17 14:20:31

by Alan

Subject: Re: Rik spreading bullshit about VM

> directory and the CVS root are on the same disk. With 2.4.17 the mp3
> player stutters, I can't even read email or edit a couple of files with
> XEmacs at the same time. With 2.4.17-rmap-11a the mp3 player runs
> smoothly and email and XEmacs are usable again.

Try 2.4.17-aa as well - the -aa stuff generally behaves better than 2.4.17
base

2002-01-17 14:21:52

by Alan

Subject: Re: Rik spreading bullshit about VM

> And that was my mistake, indeed.
>
> I think you're reading things into my words that I
> didn't want to put there.

Rik - bury this for good. Tell the world what you did mean to put there.

2002-01-17 14:25:01

by Andrea Arcangeli

Subject: oom failures with mem=4m

On Wed, Jan 16, 2002 at 10:58:45PM +0100, Diego Calleja wrote:
> On Wed, 16 Jan 2002, Andrea Arcangeli wrote:
> > attached) and most important I don't have a single bugreport about the
> > current 2.4.18pre2aa2 VM (except perhaps the bdflush wakeup that seems
> > to be a little too late and that leads to lower numbers with slow write
> > load etc.., fixable with bdflush tuning). Mainline VM kills too easily,
>
> Well, I haven't reported it yet, but booting my box with mem=4M
> gave as result: (running 2.4.18-pre2aa2):
> diego# cat /var/log/messages | grep gfp
> Jan 13 15:37:10 localhost kernel: __alloc_pages: 0-order allocation failed
> (gfp=0xf0/0)
> Jan 15 16:06:28 localhost kernel: __alloc_pages: 0-order allocation failed
> (gfp=0xf0/0)
> Jan 15 18:37:21 localhost kernel: __alloc_pages: 0-order allocation failed
> (gfp=0xf0/0)
> Jan 15 21:58:32 localhost kernel: __alloc_pages: 0-order allocation failed
> (gfp=0xf0/0)
> Jan 15 21:58:33 localhost kernel: __alloc_pages: 0-order allocation failed
> (gfp=0xf0/0)
> diego#

0xf0 shouldn't lead to an oom killing, there should be some other failure
before the killing. The above are normal warnings, they're KERN_NOTICE,
not KERN_WARNING nor KERN_ERR.
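
For readers decoding the gfp=0xf0 value in those messages, here is a minimal sketch. The bit values are recalled from the 2.4-era include/linux/mm.h and should be treated as an assumption to check against the tree in question; decode_gfp() and the gfp_names table are hypothetical helpers for illustration, not kernel code.

```c
#include <assert.h>
#include <stdio.h>

/* Flag bits as recalled from 2.4-era include/linux/mm.h (assumption). */
struct gfp_name {
    unsigned int bit;
    const char *name;
};

static const struct gfp_name gfp_names[] = {
    { 0x10,  "__GFP_WAIT"   },
    { 0x20,  "__GFP_HIGH"   },
    { 0x40,  "__GFP_IO"     },
    { 0x80,  "__GFP_HIGHIO" },
    { 0x100, "__GFP_FS"     },
};

/*
 * Print the name of every known flag set in mask, then a newline.
 * Returns the bits of mask that the table does not know about.
 */
static unsigned int decode_gfp(unsigned int mask, FILE *out)
{
    unsigned int rest = mask;
    size_t i;

    assert(out != NULL);
    for (i = 0; i < sizeof(gfp_names) / sizeof(gfp_names[0]); i++) {
        if (mask & gfp_names[i].bit) {
            fprintf(out, "%s ", gfp_names[i].name);
            rest &= ~gfp_names[i].bit;
        }
    }
    fprintf(out, "\n");
    return rest;
}
```

Under those assumed values, decode_gfp(0xf0, stdout) lists __GFP_WAIT, __GFP_HIGH, __GFP_IO and __GFP_HIGHIO with __GFP_FS clear, which would make these allocations ones that may sleep and do I/O but may not re-enter the filesystem.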

>
> Each script of /etc/rc.d was killed by VM when it was started, there wasn't
> any "OOM", just
> "VM killed..." or something similar.

That means there wasn't enough memory. It sounds like your bootup
sequence is broken and starts something big before activating swap;
either that, or you start something that takes more than 16+4m. Note that
with any recent distribution you can easily need 16+4 shortly
after boot.

If you could provide a vmstat trace during the VM killing, that could
show better if the VM is the culprit or if it did the right thing.

I know from experience that at the first VM killing people tend to point
the finger at the VM (me too sometimes at first, see the pte-highmem
thread) but at least in my tree that has never turned out to be the case
yet.

> As /etc/rc.d scripts were killed, I couldn't start swap.

Can you try to boot with emergency, then activate swap, and then check
if it runs oom again despite lots of free swap available etc...? thanks,

>
> The gfp=0x... numbers were not always the same, but I can't remember them
> because syslogd wasn't running.
> I can repeat this if you want and I'll copy all messages.
>
> ..I remember running 2.2.14 in a 386 box with 4MB of RAM and 8 or 16 of
> swap. It was veeery slow, but even I could run apache :-)...

:)

Andrea

2002-01-17 14:34:51

by Andrea Arcangeli

Subject: bugfix backed out

On Wed, Jan 16, 2002 at 08:02:42PM -0200, Rik van Riel wrote:
> On Wed, 16 Jan 2002, Diego Calleja wrote:
> > On Wed, 16 Jan 2002, Andrea Arcangeli wrote:
> > > attached) and most important I don't have a single bugreport about the
> > > current 2.4.18pre2aa2 VM (except perhaps the bdflush wakeup that seems
> >
> > Well, I haven't reported it yet, but booting my box with mem=4M
> > gave as result: (running 2.4.18-pre2aa2):
> > diego# cat /var/log/messages | grep gfp
> > Jan 13 15:37:10 localhost kernel: __alloc_pages: 0-order allocation failed
> > (gfp=0xf0/0)
>
> > Each script of /etc/rc.d was killed by VM when it was started,
>
> It seems Andrea's patch backs out a bugfix for this problem
> which marcelo and me put into the normal 2.4 kernel ...

hmm, is this the bugfix you mean? that shouldn't really matter to me as
far as I can tell, I did it in an alternate way from the start.

diff -urN 2.4.17pre8/mm/vmscan.c 2.4.17/mm/vmscan.c
--- 2.4.17pre8/mm/vmscan.c Fri Nov 23 08:21:05 2001
+++ 2.4.17/mm/vmscan.c Fri Dec 21 20:06:55 2001
@@ -338,7 +338,7 @@
{
struct list_head * entry;
int max_scan = nr_inactive_pages / priority;
- int max_mapped = nr_pages << (9 - priority);
+ int max_mapped = min((nr_pages << (10 - priority)), max_scan / 10);

spin_lock(&pagemap_lru_lock);
while (--max_scan >= 0 && (entry = inactive_list.prev) != &inactive_list) {

Furthermore, I hate those hardwired "10" magic numbers that you keep
adding. The fewer of them the better. At least I put those magic numbers in
sysctl.

see what my max_mapped is:

int orig_max_mapped = SWAP_CLUSTER_MAX * vm_mapped_ratio,

It is controlled by the vm_mapped_ratio and by the swap-cluster. So we
unmap one swap cluster at every vm_mapped_ratio of pages scanned that
were mapped. This ensures we unmap when there's some relevant work to do.
The lower the vm_mapped_ratio, the earlier the kernel will start
swapping/paging. (ah, and of course also the SWAP_CLUSTER_MAX would
better be a sysctl but it isn't yet)

Andrea

2002-01-17 15:04:47

by Rik van Riel

Subject: Re: bugfix backed out

On Thu, 17 Jan 2002, Andrea Arcangeli wrote:

> hmm, is this the bugfix you mean? that shouldn't really matter to me as
> far as I can tell, I did it in an alternate way from the start.

It matters a lot since without this change max_mapped
will always be larger than max_scan and swap_out() will
NEVER be called.
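
To put numbers on that, here is a small sketch of the two formulas from the diff quoted below. It assumes nr_pages is SWAP_CLUSTER_MAX (32 in 2.4, as far as I recall), that priority runs from 6 down to 1, and that the scan loop decrements max_mapped once per mapped page and only calls swap_out() once it goes negative. The function names are made up for illustration.

```c
#include <assert.h>

/* Assumed: the caller passes nr_pages = SWAP_CLUSTER_MAX, which was 32 in 2.4. */
#define NR_PAGES 32

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Number of inactive pages the loop may look at (same in both versions). */
static int max_scan(int nr_inactive_pages, int priority)
{
    assert(priority >= 1 && priority <= 6);
    return nr_inactive_pages / priority;
}

/* 2.4.17pre8: mapped pages tolerated before swap_out() is tried. */
static int max_mapped_pre8(int priority)
{
    assert(priority >= 1 && priority <= 6);
    return NR_PAGES << (9 - priority);
}

/* 2.4.17: the same kind of limit, but capped at a tenth of max_scan. */
static int max_mapped_2417(int nr_inactive_pages, int priority)
{
    return min_int(NR_PAGES << (10 - priority),
                   max_scan(nr_inactive_pages, priority) / 10);
}

/*
 * Nonzero if the loop can see more than `mapped` mapped pages before
 * its `scan` budget runs out, i.e. if swap_out() is reachable at all.
 */
static int can_reach_swap_out(int scan, int mapped)
{
    return mapped < scan;
}
```

Taking the 516 kB inactive list from the mem=4m report earlier in the thread (129 pages, assuming 4 kB pages) at priority 6: max_scan is 21, the pre8 limit is 256, and the 2.4.17 limit is 2. So the old formula can never reach swap_out() on that box, while the new one can.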

If this is fixed in another way in -aa I must have missed
that piece of code, I only stared at the patch for about
10 minutes before writing this email.

> diff -urN 2.4.17pre8/mm/vmscan.c 2.4.17/mm/vmscan.c
> --- 2.4.17pre8/mm/vmscan.c Fri Nov 23 08:21:05 2001
> +++ 2.4.17/mm/vmscan.c Fri Dec 21 20:06:55 2001
> @@ -338,7 +338,7 @@
> {
> struct list_head * entry;
> int max_scan = nr_inactive_pages / priority;
> - int max_mapped = nr_pages << (9 - priority);
> + int max_mapped = min((nr_pages << (10 - priority)), max_scan / 10);
>
> spin_lock(&pagemap_lru_lock);
> while (--max_scan >= 0 && (entry = inactive_list.prev) != &inactive_list) {
>
> Furthermore, I hate those hardwired "10" magic numbers that you keep
> adding. The fewer of them the better. At least I put those magic numbers in
> sysctl.

Absolutely agreed ... if it helps you, it was marcelo who
changed the 9 to 10 ;)

Ideally we'd have a VM which runs ok without magic numbers,
or at least one where changing the magic numbers has extremely
little influence, the defaults work and the sysctl switches
don't require you to learn how all the VM internals work.

> see what my max_mapped is:
>
> int orig_max_mapped = SWAP_CLUSTER_MAX * vm_mapped_ratio,
>
> It is controlled by the vm_mapped_ratio and by the swap-cluster. So we
> unmap one swap cluster at every vm_mapped_ratio of pages scanned that
> were mapped. This ensures we unmap when there's some relevant work to do.
> The lower the vm_mapped_ratio, the earlier the kernel will start
> swapping/paging. (ah, and of course also the SWAP_CLUSTER_MAX would
> better be a sysctl but it isn't yet)

Yes, but what happens when orig_max_mapped gets larger than
max_scan ? How does the -aa VM protect against that ?
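
A rough sketch of the sizes involved in that question. It assumes SWAP_CLUSTER_MAX is 32 and that -aa still computes max_scan as nr_inactive_pages / priority; the vm_mapped_ratio value used in the note after the code is purely hypothetical, since the sysctl default is not stated in this thread. The helper names are invented.

```c
#include <assert.h>

#define SWAP_CLUSTER_MAX 32   /* value in 2.4 kernels, as far as I recall */

/* The -aa formula quoted above; vm_mapped_ratio is a sysctl there. */
static int orig_max_mapped(int vm_mapped_ratio)
{
    assert(vm_mapped_ratio > 0);
    return SWAP_CLUSTER_MAX * vm_mapped_ratio;
}

/*
 * Smallest inactive list (in pages) for which
 * max_scan = nr_inactive_pages / priority is at least orig_max_mapped.
 * Below this size the scan budget runs out first.
 */
static int min_inactive_pages(int vm_mapped_ratio, int priority)
{
    assert(priority >= 1);
    return orig_max_mapped(vm_mapped_ratio) * priority;
}
```

With a hypothetical vm_mapped_ratio of 100, orig_max_mapped is 3200, and at priority 6 the inactive list would need 19200 pages (75 MB with 4 kB pages) before max_scan catches up. Any smaller list depends on whatever the code does when max_scan expires.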

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 15:11:07

by Andrea Arcangeli

Subject: clarification about redhat and vm

On Thu, Jan 17, 2002 at 01:26:07PM +0000, Alan Cox wrote:
> > If redhat doesn't use the -aa VM into their kernels that's either a
> > political decision or they're not good enough at the VM. I can tell you
>
> If you want to insult the Red Hat people please don't do it from a SuSE
> address. There are some great people at SuSE and I somehow doubt you speak
> for the management or major stockholders (ibm etc)

do you plan to sue me as well? :)

"If redhat doesn't use the -aa VM " was a short form of "if redhat
cannot see the goodness of all the bugfixing work that happened between
the 2.4.9 VM and any current branch 2.4, and so if they keep shipping
2.4.9 VM as the best one for DBMS and critical VM apps like the SAP
benchmark".

I think it's fair enough to say that if you plan to keep shipping 2.4.9
VM with all its troubles like I understood yesterday (starting from VM
highmem deadlocks, to kswapd looping into ZONE_DMA etc..., swap storms
throwing the realistic SAP benchmark to /dev/null) that was not usable
on long uptimes on big DBMS with several gigabytes of ram.

Somebody else also complained to me about this, saying that from what I said
it looks like the -aa VM is the best thing possible, which is obviously
not true. In those two lines I said -aa VM just to keep it short. The -aa VM
in 2.4.18pre2aa2 is certainly not the best that you can make,
and I suggest everybody try to make things better and invent and try
new algorithms etc... it is just the best compromise that _I_ could
make so far. So it is obvious that if anybody doesn't use the -aa VM in
2.4.18pre2aa2 it doesn't mean he doesn't understand VM. As far as I
can tell rmap could be an order of magnitude better than the -aa VM in
2.4.18pre2aa2, it's just that I haven't checked it yet (because of all the non
obvious implications the rmap design adds, see DaveM's emails of one year
back to linux-mm). All my wondering in my previous email was between
the 2.4.9 VM with all its known troubles and a sane version of the current
VM like in 2.4.18pre2aa2. So about the past and the present, not about
the present and the future. I thought it was obvious from the context of
the email. I said this in two lines and apparently RedHat didn't like
it, I'm sorry, but quite frankly I think that was fair enough, at
least with this additional clarification added.

Andrea

2002-01-17 15:22:29

by Rik van Riel

Subject: Re: clarification about redhat and vm

On Thu, 17 Jan 2002, Andrea Arcangeli wrote:
> On Thu, Jan 17, 2002 at 01:26:07PM +0000, Alan Cox wrote:
> > > If redhat doesn't use the -aa VM into their kernels that's either a
> > > political decision or they're not good enough at the VM. I can tell you
> >
> > If you want to insult the Red Hat people please don't do it from a SuSE
> > address. There are some great people at SuSE and I somehow doubt you speak
> > for the management or major stockholders (ibm etc)
>
> do you plan to sue me as well? :)
>
> "If redhat doesn't use the -aa VM " was a short form of "if redhat
> cannot see the goodness of all the bugfixing work that happened between
> the 2.4.9 VM and any current branch 2.4, and so if they keep shipping
> 2.4.9 VM as the best one for DBMS and critical VM apps like the SAP
> benchmark".

Redhat's 2.4.9 is about as close to 2.4.9 as your 2.4.18-aa is
to 2.4.17.

If you want to judge Redhat by vanilla 2.4.9, I guess we should
start judging -aa based on measuring 2.4.17 ;)

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 15:26:29

by John Jasen

Subject: Re: Rik spreading bullshit about VM

On Wed, 16 Jan 2002, Rik van Riel wrote:

> On Thu, 17 Jan 2002, V-man wrote:
>
> > So basically Rik's assertion is far from truth on many aspects.
>
> That assumes it is my assertion, it appears the journalist in
> question is missing a few lines from his IRC log though...

Then, in the interests of ending this flamewar quickly, why don't you
email clarifications to the author, cc'ing Andreas, lkml, and letting the
slashdotter children know of it?

--
-- John E. Jasen ([email protected])
-- In theory, theory and practise are the same. In practise, they aren't.

2002-01-17 15:52:19

by Andrea Arcangeli

Subject: Re: bugfix backed out

On Thu, Jan 17, 2002 at 01:04:07PM -0200, Rik van Riel wrote:
> On Thu, 17 Jan 2002, Andrea Arcangeli wrote:
>
> > hmm, is this the bugfix you mean? that shouldn't really matter to me as
> > far I can tell, I did it in an alternate way since the first place.
>
> It matters a lot since without this change max_mapped
> will always be larger than max_scan and swap_out() will
> NEVER be called.

I see, thanks for the explanation.
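
To make the arithmetic in this exchange concrete, here is a small sketch in C of the two max_mapped formulas under discussion (nr_pages << (9 - priority) before the change, and min(nr_pages << (10 - priority), max_scan / 10) after it). It assumes, as a reading of Rik's explanation rather than of the kernel source, that swap_out() is only reached once more than max_mapped mapped pages have been seen within the max_scan iterations of the scan loop. The helper names and any sample numbers are made up for illustration.

```c
#include <assert.h>

/* Scan budget for one pass: nr_inactive_pages / priority. */
static int max_scan_for(int nr_inactive_pages, int priority)
{
    assert(nr_inactive_pages >= 0 && priority >= 1);
    return nr_inactive_pages / priority;
}

/* Mapped-page threshold before the change: independent of max_scan. */
static int max_mapped_old(int nr_pages, int priority)
{
    assert(nr_pages >= 0 && priority >= 1 && priority <= 9);
    return nr_pages << (9 - priority);
}

/* Threshold after the change: capped at a tenth of the scan budget. */
static int max_mapped_new(int nr_pages, int priority, int max_scan)
{
    assert(nr_pages >= 0 && priority >= 1 && priority <= 10);
    int shifted = nr_pages << (10 - priority);
    int cap = max_scan / 10;
    return shifted < cap ? shifted : cap;
}

/*
 * Assumed model: the loop runs max_scan times and the swap path is taken
 * on mapped page number max_mapped + 1, so the threshold can only be
 * crossed when max_mapped < max_scan.
 */
static int threshold_reachable(int max_mapped, int max_scan)
{
    return max_mapped < max_scan;
}
```

With a hypothetical inactive list of 1200 pages at priority 6 and nr_pages = 32, max_scan is 200. The old formula gives a threshold of 256, which a 200-iteration loop cannot cross, while the capped formula gives 20. Because the new value never exceeds max_scan / 10, it stays below max_scan whenever max_scan is non-zero.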

> If this is fixed in another way in -aa I must have missed

yes, -aa handles the case where max_scan expires differently.

> > diff -urN 2.4.17pre8/mm/vmscan.c 2.4.17/mm/vmscan.c
> > --- 2.4.17pre8/mm/vmscan.c Fri Nov 23 08:21:05 2001
> > +++ 2.4.17/mm/vmscan.c Fri Dec 21 20:06:55 2001
> > @@ -338,7 +338,7 @@
> >  {
> >          struct list_head * entry;
> >          int max_scan = nr_inactive_pages / priority;
> > -        int max_mapped = nr_pages << (9 - priority);
> > +        int max_mapped = min((nr_pages << (10 - priority)), max_scan / 10);
> >
> >          spin_lock(&pagemap_lru_lock);
> >          while (--max_scan >= 0 && (entry = inactive_list.prev) != &inactive_list) {
> >
> > furthermore I hate those "10" hardwired magic numbers that you keep
> > adding. The less of them the better. At least I put those magics in
> > sysctl.
>
> Absolutely agreed ... if it helps you, it was marcelo who
> changed the 9 to 10 ;)

Never mind :)

> Ideally we'd have a VM which runs ok without magic numbers,
> or at least one where changing the magic numbers has extremely
> little influence, the defaults work and the sysctl switches
> don't require you to learn how all the VM internals work.

100% agreed.

>
> > see what my max_mapped is:
> >
> > int orig_max_mapped = SWAP_CLUSTER_MAX * vm_mapped_ratio,
> >
> > It is controlled by the vm_mapped_ratio and by the swap-cluster. So we
> > unmap one swap cluster at every vm_mapped_ratio of pages scanned that
> > were mapped. This ensures we unmap when there's some relevant work to do.
> > The lower the vm_mapped_ratio, the earlier the kernel will start
> > swapping/paging. (ah, and of course also the SWAP_CLUSTER_MAX would
> > better be a sysctl but it isn't yet)
>
> Yes, but what happens when orig_max_mapped gets larger than
> max_scan ? How does the -aa VM protect against that ?

In short, I also call swap_out again from outside shrink_caches if it
fails (if max_scan times out, it means shrink_caches will fail).

Andrea
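
The approach described in this message, where the mapped threshold is SWAP_CLUSTER_MAX * vm_mapped_ratio and swap_out is invoked from outside shrink_caches when shrink_caches falls short, can be sketched as a toy model in C. This is not the -aa code: the struct, the model_* functions, the value 32 for SWAP_CLUSTER_MAX and the default of 100 for vm_mapped_ratio are all assumptions chosen for illustration.

```c
#include <assert.h>

#define SWAP_CLUSTER_MAX 32          /* assumed value for this sketch */

static int vm_mapped_ratio = 100;    /* assumed default, stands in for the sysctl */

/* Threshold as written in the message. */
static int orig_max_mapped(void)
{
    return SWAP_CLUSTER_MAX * vm_mapped_ratio;
}

/* Hypothetical stand-in for the real page lists. */
struct vm_model {
    int cache_pages;     /* pages the cache shrinker can free directly */
    int anon_pages;      /* mapped pages that need swap_out first */
    int swap_out_calls;  /* how often the fallback was used */
};

/* Frees up to nr_pages from the cache; returns how many are still missing. */
static int model_shrink_caches(struct vm_model *vm, int nr_pages)
{
    int freed = vm->cache_pages < nr_pages ? vm->cache_pages : nr_pages;
    vm->cache_pages -= freed;
    return nr_pages - freed;
}

/* Unmaps up to one swap cluster, making those pages reclaimable. */
static void model_swap_out(struct vm_model *vm)
{
    int moved = vm->anon_pages < SWAP_CLUSTER_MAX ? vm->anon_pages
                                                  : SWAP_CLUSTER_MAX;
    vm->anon_pages -= moved;
    vm->cache_pages += moved;
    vm->swap_out_calls++;
}

/* Returns 1 if nr_pages could be freed within max_tries passes, else 0. */
static int model_try_to_free_pages(struct vm_model *vm, int nr_pages,
                                   int max_tries)
{
    assert(nr_pages > 0 && max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        nr_pages = model_shrink_caches(vm, nr_pages);
        if (nr_pages == 0)
            return 1;
        /* The shrinker fell short: swap out from the outside, then retry. */
        model_swap_out(vm);
    }
    return 0;
}
```

The point of the design, as far as this toy can show it, is that swapping does not depend on a threshold inside the scan loop being crossed: a failed pass is itself the trigger.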

2002-01-17 16:05:40

by Alan

Subject: Re: clarification about redhat and vm

> "If redhat doesn't use the -aa VM " was a short form of "if redhat
> cannot see the goodness of all the bugfixing work that happened between
> the 2.4.9 VM and any current branch 2.4, and so if they keep shipping
> 2.4.9 VM as the best one for DBMS and critical VM apps like the SAP
> benchmark".

The RH VM is totally unrelated to the crap in 2.4.9 vanilla. The SAP comment
begs a question. 2.4.10 seems to have problems remembering to actually
do fsync()'s. How much of your SAP benchmark is from fsync's that dont
happen ? Do you get the same values with 2.4.18-aa ?

Alan

2002-01-17 16:31:32

by Andrea Arcangeli

Subject: Re: clarification about redhat and vm

On Thu, Jan 17, 2002 at 04:17:21PM +0000, Alan Cox wrote:
> The RH VM is totally unrelated to the crap in 2.4.9 vanilla. The SAP comment
> begs a question. 2.4.10 seems to have problems remembering to actually
> do fsync()'s. How much of your SAP benchmark is from fsync's that dont
> happen ? Do you get the same values with 2.4.18-aa ?

AFAIK the bench was not with 2.4.10 (not that I remember any missing fsync
anyway; actually MS_ASYNC is broken and this is fixed in
18pre2aa2 (fix from Andrew Morton), but that was broken in 2.4.[79] too). The
bench in 2.2 was delivering much better performance than with 2.4 (I
don't recall the exact number) and 2.2 definitely is not missing fsync
etc... Furthermore the 2.2 numbers were reproducible. The benchmark swaps
shm heavily etc... and the 2.4.[79] vm was collapsing at the second pass
(I think first throughput was 5 then 1 1 1 1). If you always swap out the
wrong part and you start thrashing because of unbalanced aging, it is
very easy to make a x10 difference in the final numbers. I think a sane
vm should run faster than 2.2 and be as reproducible as 2.2. I tend to
like such a test, also because it is a real life test (unlike what
somebody thought). The huge regression in such a test was one of the main
reasons that made me realize the brokenness of the vm algorithms.

Andrea

2002-01-17 17:41:38

by Bill Davidsen

Subject: Re: Rik spreading bullshit about VM

In article <[email protected]> you write:

| It's not that I think Andrea's VM is bad, it's just that a VM should be
| tuned for the common cases, not for the power users that want to
| squeeze every last drop out of it. It's fine with me if somebody wants
| to design a VM for the niche XYZ, but do that as a separate patch and
| don't clutter up the mainline kernel with it.

I have a few points of disagreement with that. I have no idea what
prices are doing elsewhere, but in the USA memory prices are around
$250-400/GB for memory (from crap to decent ECC) and there are a lot
more machines which live in the "power user" range than there used to
be. And when memory was really cheap, ~$150/GB, many people built big
Athlon systems for small $$. I totally agree that Linux should run in
4MB, but that's not typical anymore.

My real disagreement is that we should be doing worst case analysis on
the VM and scheduler rather than trying to go for best at one thing,
calling that typical, and then letting all other loads take the
leavings. I like to be able to tune, but the kernel should do a decent
job with systems having any reasonable mix of large and small, i/o and
CPU bound jobs. I don't like even the implication that it's okay for
performance to suck, or for new kernels to be worse than 2.4.14 or so.
Alan Cox had some comments on this, and has started the -ac series again
because of it.

So far I find 18pre2aa2 with some setting for bdflush to work
acceptably on several largish machines and one small system with many
processes and working set 3-4x physical memory. More later.

Let's aim for "doesn't suck" instead of "perfect for XXX" and more
people will be satisfied if not delighted.
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-01-17 19:09:23

by Bill Davidsen

Subject: Re: Rik spreading bullshit about VM

In article <[email protected]> you write:
| About running the Andrea VM in 4M:
| I booted in single user mode.
|
| bash-2.05a# uname -a
| Linux mountain 2.4.18pre2aa2 #1 Wed Jan 9 21:44:03 EST 2002 i586 unknown
|
| bash-2.05a# dmesg| grep mem
| Kernel command line: BOOT_IMAGE=2418p2aa2 ro root=1602 console=ttyS1,38400n8 single mem=4m
| Memory: 2100k/4096k available (891k kernel code, 1608k reserved, 215k data, 196k init, 0k highmem)
| Freeing unused kernel memory: 196k freed

[...snip...]

| I've tested a bunch of kernels lately. This is only what's sitting in /boot
| since I last cleaned it out:
|
| mountain:/boot$ ls vml*
| vmlinuz vmlinuz-2.4.17rc2aa2-old vmlinuz-2.4.18-pre3
| vmlinuz-2.4.18pre2 vmlinuz-2.4.18pre3ll vmlinuz-2.5.1-dj11
| vmlinuz-2.5.2-pre10 vmlinuz-2.5.2-pre9 vmlinuz-2.4.17
| vmlinuz-2.4.17rc2aa2-wli vmlinuz-2.4.18-pre4 vmlinuz-2.4.18pre2aa1
| vmlinuz-2.4.18pre3pe vmlinuz-2.5.1-dj13 vmlinuz-2.5.2-pre11
| vmlinuz-2.5.2-pre9mingo vmlinuz-2.4.17-rmap11a vmlinuz-2.4.17rmap11b
| vmlinuz-2.4.18pre1-mjc2nio vmlinuz-2.4.18pre2aa2 vmlinuz-2.4.18pre3pelb
| vmlinuz-2.5.1-dj14 vmlinuz-2.5.2-pre5 vmlinuz.old
| vmlinuz-2.4.17rc2aa2 vmlinuz-2.4.18-pre1 vmlinuz-2.4.18pre1mjc2
| vmlinuz-2.4.18pre3-ac2 vmlinuz-2.5.1 vmlinuz-2.5.2
| vmlinuz-2.5.2-pre6-mingo
|
| IMHO, 2.4.18pre2aa2 is the best!

Of the current kernels it may well be, I have tried it on two largish
machines and it worked "right," I'm building it on a small and slow
machine to try that. However, I was not happy with the default bdflush
settings, and people who don't see what they want should use the
expanded capabilities of -aa before complaining. Also, the performance
I've seen so far, and I have NOT run a full set of tests, would
indicate that it is slightly better than 2.4.13-acN (N is 5, 7 or 8,
don't have it on this machine).

I am looking forward to testing on many configs, and just for the
extra tuning tools in bdflush I think it will be good on all of them.
As I posted before, I think the best scheduler is the one which doesn't
have "jackpot cases" which produce really bad performance. With a
little tuning I believe -aa is there.

Finally, that said I'm trying a patch of my own to -aa, which is why
I haven't run the full set of tests, there are two things I think will
make it even better, and I am finally getting to understand the code,
little as I wanted to.

I can't disagree with Rik on the VM in the unpatched kernel for
several months. It really was not good, and both the -ac and -aa kernels
were taking aim at that problem. IMHO the changes went in before the
bugs went out. Needless to say Rik should not apply for a job as a
diplomat, but I can't disagree with the existence of a problem. If RH
uses a custom VM he was factual about that, although there may be
several reasons for the choice.

Maybe we could deflect the pissing contest back to technical
discussion now?
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-01-17 19:50:04

by Rik van Riel

Subject: Re: Rik spreading bullshit about VM

On Thu, 17 Jan 2002, bill davidsen wrote:

> I can't disagree with Rik on the VM in the unpatched kernel for
> several months. It really was not good, and both the -ac and -aa kernels
> were taking aim at that problem. IMHO the changes went in before the
> bugs went out. Needless to say Rik should not apply for a job as a
> diplomat, but I can't disagree with the existence of a problem.

Good thing I'm a programmer, I'm allowed to say ambiguous
stuff on IRC. If I were a diplomat, I'd only be allowed
to unambiguously say nothing.

> If RH uses a custom VM he was factual about that, although there may
> be several reasons for the choice.

I've tried to be as factual as possible; I've also learnt
that articles and IRC really use a different kind of
language ... some of the things I said turned out pretty
ambiguous ;)

> Maybe we could deflect the pissing contest back to technical
> discussion now?

I've released rmap-11c today ;)

cheers,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-17 20:23:29

by Dan Chen

Subject: Re: Rik spreading bullshit about VM

Excellent, Rik!

I've posted a diff for rmap-11c against vanilla 2.4.18-pre4 at
http://www.cs.unc.edu/~chenda/Other/2.4.18-pre4_to_rmap11c.diff

On Thu, Jan 17, 2002 at 05:49:26PM -0200, Rik van Riel wrote:
> > Maybe we could deflect the pissing contest back to technical
> > discussion now?
>
> I've released rmap-11c today ;)

--
Dan Chen [email protected]
GPG key: http://www.unc.edu/~crimsun/pubkey.gpg.asc



2002-01-17 21:41:20

by Trever L. Adams

Subject: Re: Rik spreading bullshit about VM

On Thu, 2002-01-17 at 08:15, Rik van Riel wrote:
> Which part of "to interprete" do you not understand ?
>
> Interpreting my text is reading it and building a
> meaning for the words in your head. It doesn't mean
> the text needs to be changed ... the meaning can be
> different for each person reading the exact same
> words.


To an extent anyone is responsible for how another takes (interprets)
what they say. Culture is that combined collection of knowledge of the
human race. It is your responsibility as a human to be as clear as
possible (within reason).

Were you that clear? It seems not, but I don't know. Just don't say
you have no responsibility for how others take what you say, because
that is just dog crap. No communication or collaboration would ever
take place if we all had that attitude. (It most likely couldn't take
place, let alone wouldn't.)

Trever Adams

2002-01-18 00:29:32

by Adam Kropelin

Subject: Re: async buffer flushing reported slowdown (could be a driver issue?)

Andrea Arcangeli wrote:
> On Wed, Jan 16, 2002 at 04:29:54PM -0500, Adam Kropelin wrote:
> > Andrea Arcangeli wrote:
> > <snip>
> > >I don't have a single bugreport about the current 2.4.18pre2aa2 VM
> > >(except perhaps the bdflush wakeup that seems to be a little too late
> > >and that leads to lower numbers with slow write load etc.., fixable
> > >with bdflush tuning).
> >
> > As reported[0] in the above-mentioned thread, the bdflush tuning
> > parameters you suggested made no difference in my test case other than
> > slightly adjusting the temporal relationship between writeout and file
> > transfer. -aa still performs slightly worse than both 2.4.17 stock and
> > -rmap. 2.4.13-ac7 currently beats all competitors.
>
> Then can you verify the bandwidth you get out of the network card is the
> same across 2.4.13-ac7 and all the other kernels you are trying. Also

I'll check that and get back to you.

> please check with an hdparm -t the speed you get out of IDE is the same.

There is no IDE in the system. The destination for the file transfer is on
cpqarray RAID5. Do you have a recommendation for how I test the transfer rate of
that without stressing the VM?

> This sounds like some driver changed (note that -ac is used to queue
> lots of driver updates) and that made the difference. Otherwise if we
> wakeup bdflush early enough I don't see why it takes more time.

One of my original tests[0] was to take the cpqarray update from -ac and bring
it forward to 2.4.17. I saw about 20 sec improvement with that, still not
competitive overall with -ac performance. I'll try doing the same with the
eepro driver, which is the NIC I'm using.

--Adam

[0]
http://www.kroptech.com:8300/mailimport/showmsg.php?msg_id=49714&db_name=linux_kernel


2002-01-18 01:48:50

by Brian Litzinger

Subject: Re: Rik spreading bullshit about VM


Let us not forget about 'deconstructionism':

deconstructionism -- (a philosophical theory of criticism ...
that seeks to expose deep-seated contradictions in a work
by delving below its surface meaning)

It doesn't even matter what Rik thinks he wrote. The real meaning is
what *we*, the readers, can devolve from his writings.

8-)

- Brian Litzinger <[email protected]>

> On Thu, 2002-01-17 at 08:15, Rik van Riel wrote:
> > Which part of "to interprete" do you not understand ?
> >
> > Interpreting my text is reading it and building a
> > meaning for the words in your head. It doesn't mean
> > the text needs to be changed ... the meaning can be
> > different for each person reading the exact same
> > words.


On Thu, Jan 17, 2002 at 04:41:02PM -0500, Trever L. Adams wrote:
> To an extent anyone is responsible for how another takes (interprets)
> what they say. Culture is that combined collection of knowledge of the
> human race. It is your responsibility as a human to be as clear as
> possible (within reason).
>
> Were you that clear? It seems not, but I don't know. Just don't say
> you have no responsibility for how others take what you say, because
> that is just dog crap. No communication or collaboration would ever
> take place if we all had that attitude. (It most likely couldn't take
> place, let alone wouldn't.)

2002-01-18 03:22:34

by Dan Mann

Subject: ...Re: Rik spreading bullshit about VM

Don't worry about it. Everyone knows (or should know) that you and
Andrea both have the best interest of the kernel vm in mind. It's
normal for both of you to have strong feelings about something that you
both take pride in. What would be great would be seeing both of you
come together and really kick some serious vm ass and prove that linux
can compete with any OS. Maybe you two can come to a compromise along
the lines of "we'll try it your way for a while, and I'll back you, then
we'll try it my way for a while and you back me. We'll go with the one
that works best for Linux."

I'm sure there are hundreds if not thousands of members on this list
that respect you both and believe that either one of you can provide a
great vm.

I'm behind both of you.

my .02

Dan

2002-01-18 04:32:58

by Bosko Radivojevic

Subject: Re: Rik spreading bullshit about VM


On Thu, 17 Jan 2002, Erik Mouw wrote:

> Some time ago Linus made the important observation that we shouldn't
> tune the scheduler for SMP systems simply because 99.9% of the systems
> in the world running linux have a single CPU. IMHO an equally valid
> observation would be that we shouldn't tune the VM for the 0.1% of the
> systems in this world that run large DBMSes. The 99.9% majority is much
> more important.

There is a way to fulfill both needs. If my systems are part of that 0.1%,
I have to disagree with you. :)

There is no way to make one good VM for all possible situations. But, you
can tune/make one VM to work great on large DBMS (e.g.) and tune/make
another one to work great on ordinary desktop systems (playing mp3s & co).
So, add different VMs as kernel-config options. The 'default' one should
be VM for 99.9% users. Everybody happy? :)

Greetings


2002-01-18 04:36:58

by Rik van Riel

Subject: vm philosophising

On Fri, 18 Jan 2002, Bosko Radivojevic wrote:

> There is no way to make one good VM for all possible situations. But,
> you can tune/make one VM to work great on large DBMS (e.g.) and
> tune/make another one to work great on ordinary desktop systems

This is an interesting assertion ... but up to date nobody has
been able to tell me what exactly should be different between
these two mythical VMs ;)

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-18 04:58:52

by Matthew Johnson

Subject: Re: vm philosophising

> This is an interesting assertion ... but up to date nobody has
> been able to tell me what exactly should be different between
> these two mythical VMs ;)
>

Well what is the different requirements for desktop use versus, the server in
terms of virtual memory? What I am doing on my system with MP3's playing,
running Xfree86 on a SuSE 7.3 system will be different from a server running
DMBS's.

Another issue, would it be possible to select one or the other VM's like you
do for example the CPU type?

Kind regards,

Matt

2002-01-18 05:13:35

by Rik van Riel

Subject: Re: vm philosophising

On Thu, 17 Jan 2002, Matthew Johnson wrote:

> > This is an interesting assertion ... but up to date nobody has
> > been able to tell me what exactly should be different between
> > these two mythical VMs ;)
>
> Well what is the different requirements for desktop use versus, the
> server in terms of virtual memory?

That's a good question ....

> What I am doing on my system with MP3's playing, running Xfree86 on a
> SuSE 7.3 system will be different from a server running DMBS's.

.... especially because I haven't seen any suggestion on
how these different workloads would get translated into
different VM requirements.

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-18 05:19:05

by Ryan Cumming

Subject: Re: vm philosophising

On January 17, 2002 20:58, Matthew Johnson wrote:
> Well what is the different requirements for desktop use versus, the server
> in terms of virtual memory? What I am doing on my system with MP3's
> playing, running Xfree86 on a SuSE 7.3 system will be different from a
> server running DMBS's.

So, your argument basically boils down to "the differences will be that it's,
well, different." Please do us a favour and never join a debating team of any
sort.

-Ryan

2002-01-18 05:43:41

by Matthew Johnson

Subject: Re: vm philosophising

> So, your argument basically boils down to "the differences will be that
> it's, well, different." Please do us a favour and never join a debating
> team of any sort.
>

Did you get out of the wrong side of the bed or something? Geez, try and help
then get flamed by a troll for your efforts...Plus the fact I was not even
arguing! I was asking questions...

Further, I was postulating to see what the differences (or indeed
similarities) are between the VM's to cope with desktop use vs server use.
No-one seems to know right now.

How does one test the VM precisely? Sorry for the ignorance on this subject.

Kind regards,

Matt
PS If you're going to reply with a childish, unprofessional remark do us all
a favor and be quiet. If I say something stupid and nobody replies I get the
hint usually.

2002-01-18 06:06:17

by Matthew Johnson

Subject: Re: vm philosophising

On Thursday 17 January 2002 09:18 pm, Ryan Cumming wrote:
> On January 17, 2002 20:58, Matthew Johnson wrote:
> > Well what is the different requirements for desktop use versus, the
> > server in terms of virtual memory? What I am doing on my system with
> > MP3's playing, running Xfree86 on a SuSE 7.3 system will be different
> > from a server running DMBS's.

Slight typo in the last line: it should read "What am I", not "What I am"; perhaps
this led to some confusion. Shouldn't write to this list after driving for
almost 12 hours :).

Matt

2002-01-18 14:43:17

by Tommy Faasen

Subject: Re: vm philosophising

On Fri, Jan 18, 2002 at 02:36:02AM -0200, Rik van Riel wrote:
> On Fri, 18 Jan 2002, Bosko Radivojevic wrote:
>
> > There is no way to make one good VM for all possible situations. But,
> > you can tune/make one VM to work great on large DBMS (e.g.) and
> > tune/make another one to work great on ordinary desktop systems
>
> This is an interesting assertion ... but up to date nobody has
> been able to tell me what exactly should be different between
> these two mythical VMs ;)
>
I have no clue about VM's but I can imagine that for example the following situations have different requirements:
1-Desktop: many "small" apps, I believe exe's remain in memory and data is written to disk? Anyway I can imagine fragmentation and latency are an issue here.
2-DBMS: 1 or 2 big programs which sometimes even do their own memory management. Fragmentation and latency aren't an issue here I think, however moving lots of data to and from swap is.
3-Webserver: for example apache with many children being created under high load and killed under low load. The data is always small (in case of static pages). So a lot of small swaps? Latency is not as much of an issue but I can imagine that fragmentation can be an issue?

I think these 3 situations behave very differently, but then again it's just what I think. I can also imagine that more situations are possible but not many.
I also indicated that we have a few parameters we can optimise for, like latency, fragmentation and moving a lot of small chunks, or occasionally 1 big chunk.

I know from an AI perspective that optimizing for 3 different parameters is difficult.
> regards,
>
> Rik
> --
> "Linux holds advantages over the single-vendor commercial OS"
> -- Microsoft's "Competing with Linux" document
>
> http://www.surriel.com/ http://distro.conectiva.com/
>

2002-01-18 15:53:12

by listmail

Subject: Re: vm philosophising



On Fri, 18 Jan 2002, Tommy Faasen wrote:


> I have no clue about VM's but I can imagine that for example the following situations have different requirements:
> 1-Desktop: many "small" apps, I believe exe's remain in memory and data is written to disk? Anyway I can imagine fragmentation and latency are an issue here.
> 2-DBMS: 1 or 2 big programs which sometimes even do their own memory management. Fragmentation and latency aren't an issue here I think, however moving lots of data to and from swap is.
> 3-Webserver: for example apache with many children being created under high load and killed under low load. The data is always small (in case of static pages). So a lot of small swaps? Latency is not as much of an issue but I can imagine that fragmentation can be an issue?

There is another situation to consider which I think is more typical in the
"Real World":

4 - The typical web server environment, in many, many companies that I have
seen.
The machine runs both the DBMS (MySQL, Oracle, etc.) and the web server
Apache. Therefore a balance of memory use needs to be struck between the
two applications. Then usually the system also has a development
environment at the ready in case changes need to be made quickly from the
console. This may not be true as things scale up for a business and
functions get separated onto different machines, but many startups and small
businesses that I have worked with, which have turned to Linux for the OS
because of the TCO it offers over the other guys, tend to have just that
single does-everything box.

-Bill

2002-01-18 16:54:31

by Wilhelm Nuesser

Subject: Re: clarification about redhat and vm

Alan Cox wrote:
>
> > "If redhat doesn't use the -aa VM " was a short form of "if redhat
> > cannot see the goodness of all the bugfixing work that happened between
> > the 2.4.9 VM and any current branch 2.4, and so if they keep shipping
> > 2.4.9 VM as the best one for DBMS and critical VM apps like the SAP
> > benchmark".
>
> The RH VM is totally unrelated to the crap in 2.4.9 vanilla. The SAP comment
> begs a question. 2.4.10 seems to have problems remembering to actually
> do fsync()'s. How much of your SAP benchmark is from fsync's that dont
> happen ? Do you get the same values with 2.4.18-aa ?


Well, basically we checked the thing many times with quite different
kernels. Our current tests - which show exactly the same results as
2.4.[10,14,15] - run on the new "official" SuSE kernel 2.4.16. Again, we
observe a performance increase in high swap situations of about a factor
of ten compared to 2.4.[7,9].

IMO, this shows that errors like fsync etc. are _not_ responsible for
the improved performance.

But of course, we will check the newer kernels as well. I think we
could live with another factor of ten ...



--
Best regards
Willi

-----------------------------------
Willi Nuesser
SAP Linuxlab

2002-01-18 16:59:31

by Wilhelm Nuesser

Subject: Re: clarification about redhat and vm

Alan Cox wrote:

>>"If redhat doesn't use the -aa VM " was a short form of "if redhat
>>cannot see the goodness of all the bugfixing work that happened between
>>the 2.4.9 VM and any current branch 2.4, and so if they keep shipping
>>2.4.9 VM as the best one for DBMS and critical VM apps like the SAP
>>benchmark".
>>
>
>The RH VM is totally unrelated to the crap in 2.4.9 vanilla. The SAP comment
>begs a question. 2.4.10 seems to have problems remembering to actually
>do fsync()'s. How much of your SAP benchmark is from fsync's that dont
>happen ? Do you get the same values with 2.4.18-aa ?
>
Well, basically we checked the thing many times with quite different
kernels. Our current tests - which show exactly the same results as
2.4.[10,14,15] - run on the new "official" SuSE kernel 2.4.16. Again, we
observe a performance increase in high swap situations of about a factor
of ten compared to 2.4.[7,9].

IMO, this shows that errors like fsync etc. are _not_ responsible for
the improved performance.

But of course, we will check the newer kernels as well. I think we
could live with another factor of ten ...

Best regards
Willi

-----------------------

Willi Nuesser
SAP LinuxLab




2002-01-18 18:40:59

by Oliver Xymoron

Subject: Re: vm philosophising

On Fri, 18 Jan 2002, Rik van Riel wrote:

> On Fri, 18 Jan 2002, Bosko Radivojevic wrote:
>
> > There is no way to make one good VM for all possible situations. But,
> > you can tune/make one VM to work great on large DBMS (e.g.) and
> > tune/make another one to work great on ordinary desktop systems
>
> This is an interesting assertion ... but up to date nobody has
> been able to tell me what exactly should be different between
> these two mythical VMs ;)

There is another VM that has a property that people would like:
deterministically handling memory exhaustion. Unfortunately, that VM
probably can't co-exist with over-commit and the performance gains that
affords.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2002-01-18 19:06:45

by Andrea Arcangeli

Subject: Re: clarification about redhat and vm

On Fri, Jan 18, 2002 at 05:46:33PM +0100, Wilhelm Nuesser wrote:
> Alan Cox wrote:
>
> >>"If redhat doesn't use the -aa VM " was a short form of "if redhat
> >>cannot see the goodness of all the bugfixing work that happened between
> >>the 2.4.9 VM and any current branch 2.4, and so if they keep shipping
> >>2.4.9 VM as the best one for DBMS and critical VM apps like the SAP
> >>benchmark".
> >>
> >
> >The RH VM is totally unrelated to the crap in 2.4.9 vanilla. The SAP comment
> >begs a question. 2.4.10 seems to have problems remembering to actually
> >do fsync()'s. How much of your SAP benchmark is from fsync's that dont
> >happen ? Do you get the same values with 2.4.18-aa ?
> >
> Well, basically we checked the thing many times with quite different
> kernels. Our current tests - which show exactly the same results as
> 2.4.[10,14,15] - run on the new "official" SuSE kernel 2.4.16. Again, we
> observe a performance increase in high swap situations of about a factor
> of ten compared to 2.4.[7,9].
>
> IMO, this shows that errors like fsync etc. are _not_ responsible for
> the improved performance.

and I assume you were using either ext2 or reiserfs anyway, so I believe
the fsync problem never affected you in the first place (also with older
kernels).

Andrea

2002-01-18 19:13:45

by Alan

Subject: Re: vm philosophising

> There is another VM that has a property that people would like:
> deterministically handling memory exhaustion. Unfortunately, that VM
> probably can't co-exist with over-commit and the performance gains that
> affords.

It can definitely co-exist. Overcommit control is just a book keeping
exercise on address space commits.
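
As a rough illustration of what "book keeping on address space commits" could mean, here is a minimal sketch in C: every request to commit address space is charged against a fixed limit, and a request that would exceed it is refused up front rather than failing later under memory pressure. This is not the kernel's accounting code; the struct, the function names and the way the limit is chosen are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ledger: how much address space has been promised so far. */
struct commit_ledger {
    size_t limit_pages;      /* policy limit, e.g. derived from RAM plus swap */
    size_t committed_pages;  /* invariant: committed_pages <= limit_pages */
};

/* Charge a new commit; returns 0 on success, -1 if it would overcommit. */
static int commit_reserve(struct commit_ledger *l, size_t pages)
{
    /* The invariant guarantees the subtraction cannot underflow. */
    if (pages > l->limit_pages - l->committed_pages)
        return -1;
    l->committed_pages += pages;
    return 0;
}

/* Give back a commit that was charged earlier. */
static void commit_release(struct commit_ledger *l, size_t pages)
{
    assert(pages <= l->committed_pages);
    l->committed_pages -= pages;
}
```

In such a scheme the refusal happens at reservation time, which is what makes exhaustion deterministic; the cost is that address space which is reserved but never touched still counts against the limit.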

2002-01-18 20:17:47

by David Schwartz

Subject: Re: vm philosophising



On Fri, 18 Jan 2002 19:23:47 +0000 (GMT), Alan Cox wrote:

>Overcommit control is just a book keeping
>exercise on address space commits.

A bookkeeping technique developed by Arthur Andersen.

DS


2002-01-18 21:27:41

by Alan

Subject: Re: vm philosophising

> On Fri, 18 Jan 2002 19:23:47 +0000 (GMT), Alan Cox wrote:
>
> >Overcommit control is just a book keeping
> >exercise on address space commits.
>
> A bookkeeping technique developed by Arthur Andersen.

Hardly, and for many workloads it's actually a very good thing to do. Of
course there is always a small, mostly theoretical risk of doing an Enron.

2002-01-18 15:36:11

by Mr. Shannon Aldinger

Subject: Re: vm philosophising

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Fri, 18 Jan 2002, Rik van Riel wrote:

> On Fri, 18 Jan 2002, Bosko Radivojevic wrote:
>
> > There is no way to make one good VM for all possible situations. But,
> > you can tune/make one VM to work great on large DBMS (e.g.) and
> > tune/make another one to work great on ordinary desktop systems
>
> This is an interesting assertion ... but up to date nobody has
> been able to tell me what exactly should be different between
> these two mythical VMs ;)
>
I can see two different "VMs". I say "VMs" because it could be the same
code with different magic numbers to control its behavior.

From a file & database point of view throughput is the most critical
aspect. Both disk and network throughput. Interactive response on such
systems isn't as critical as most of the time it will sit there processing
queries or sending files.

From a desktop point of view interactive response is critical, however
disk and network throughput also have to have a fine balance. Maybe the
balance is three way here between interactive response, disk throughput
and network throughput.

Perhaps a VM system in which you select your main focus, server vs
desktop, would be the way to go. The end user should also be able to adjust
this balance. Say a person selected desktop and is a graphic artist: they
may not care as much about network throughput and would rather push up
interactive response and disk throughput at the expense of network
throughput.
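
As a rough sketch of "the same code with different magic numbers": the
profile names, fields and weights below are invented for illustration
and correspond to no real kernel tunables. The weights are percentages
that sum to 100.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tunables; fields and values are made up for illustration. */
struct vm_profile {
    const char *name;
    int interactive_weight; /* priority given to interactive response */
    int disk_weight;        /* priority given to disk throughput */
    int net_weight;         /* priority given to network throughput */
};

static const struct vm_profile profiles[] = {
    { "server",  10, 45, 45 },
    { "desktop", 50, 30, 20 },
};

/* Look up a built-in profile by name; returns NULL if unknown. */
const struct vm_profile *vm_profile_find(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(profiles) / sizeof(profiles[0]); i++)
        if (strcmp(profiles[i].name, name) == 0)
            return &profiles[i];
    return NULL;
}

/* Move up to `points` of weight from network to interactive and disk. */
void vm_profile_favor_local(struct vm_profile *p, int points)
{
    if (points < 0)
        points = 0;
    if (points > p->net_weight)
        points = p->net_weight;
    p->net_weight -= points;
    p->interactive_weight += points / 2;
    p->disk_weight += points - points / 2;
    assert(p->interactive_weight + p->disk_weight + p->net_weight == 100);
}
```

The graphic artist in the example would start from a copy of "desktop"
and call `vm_profile_favor_local()` on it.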

Regards.
PS: IANAVMP (I Am Not A VM Programmer)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iEYEARECAAYFAjxIRZAACgkQwtU6L/A4vVDUTQCdG4Pg4hYGPvRXN9kBVfDyWBbD
bnsAnigMlPA21izLJUhKjZcTeeaaK9IC
=EKri
-----END PGP SIGNATURE-----


2002-01-19 04:37:35

by David Luyer

Subject: RE: vm philosophising

Alan Cox wrote:
> > There is another VM that has a property that people would like:
> > deterministically handling memory exhaustion. Unfortunately, that VM
> > probably can't co-exist with over-commit and the performance gains that
> > affords.
>
> It can definitely co-exist. Overcommit control is just a book keeping
> exercise on address space commits.

And that's a _definitely_; other OSes have done it. Digital Unix, for one,
on the basis of a file called 'swapdefault', swapped between overcommit
and precommit modes. I was rather disappointed when I first tried to
enable overcommit mode on Solaris 2.x (where 'x' was probably somewhere
around 4) and searched for quite some time before giving up and deciding
it wasn't a tunable option...

Although I've never actually deliberately run a system in precommit mode,
it always used to be the first thing to "fix" on a Digital Unix box, and
when I discovered Solaris had the same "flaw" my suggestion was to move
the affected applications (large applications which fork before exec'ing
or fork short-lived children, at around 1/2 GB+ each, which should just
be short-lived COW shared mappings and not exhaust memory) to Linux.
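
The arithmetic behind the fork case can be sketched as follows. This is
a simplified model under stated assumptions, not a description of any
particular OS: the function name and parameters are hypothetical, and
all sizes are in pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: how many forked children fit into `free_pages`.
 * precommit: each fork reserves a full copy of the parent's private
 *            writable pages up front.
 * otherwise: each child costs only the pages it actually writes (COW).
 */
size_t children_that_fit(size_t free_pages, size_t parent_pages,
                         size_t pages_written, bool precommit)
{
    size_t per_child = precommit ? parent_pages : pages_written;

    assert(pages_written <= parent_pages);
    if (per_child == 0)
        return (size_t)-1; /* no cost, so no limit in this model */
    return free_pages / per_child;
}
```

With 4 KB pages, a 512 MB parent is 131072 pages. Given 262144 free
pages (1 GB), the precommit case allows 2 children, while children that
each dirty 16 pages before exec'ing allow 16384 in this model.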

And while precommit may be something people ask for, I'd have to say many
of them would, having experienced the difference on identical hardware,
then realise what a bad idea it was and go back to the current mode.
That is, it sounds like a big waste of time to implement the 'traditional'
behaviour which Linux is already so much better than.

David.
--
David Luyer Phone: +61 3 9674 7525
Network Manager P A C I F I C Fax: +61 3 9699 8693
Pacific Internet (Australia) I N T E R N E T Mobile: +61 4 1111 BYTE
http://www.pacific.net.au/ NASDAQ: PCNTF

2002-01-19 04:43:44

by David Luyer

Subject: RE: vm philosophising

I wrote:
> Alan Cox wrote:
> > > There is another VM that has a property that people would like:
> > > deterministically handling memory exhaustion.
> > > Unfortunately, that VM
> > > probably can't co-exist with over-commit and the
> > > performance gains that
> > > affords.
> >
> > It can definitely co-exist. Overcommit control is just a
> > book keeping
> > exercise on address space commits.

[...]

and the comment I somehow missed putting on the end:

If you want to philosophise about VM strategies, think of
overcommit as "ethernet" and precommit as "token ring".

David.

2002-01-19 05:46:54

by David Weinehall

Subject: Re: vm philosophising

On Sat, Jan 19, 2002 at 03:49:02PM +1100, David Luyer wrote:
> I wrote:
> > Alan Cox wrote:
> > > > There is another VM that has a property that people would like:
> > > > deterministically handling memory exhaustion.
> > > > Unfortunately, that VM
> > > > probably can't co-exist with over-commit and the
> > > > performance gains that
> > > > affords.
> > >
> > > It can definitely co-exist. Overcommit control is just a
> > > book keeping
> > > exercise on address space commits.
>
> [...]
>
> and the comment I somehow missed putting on the end:
>
> If you want to philosophise about VM strategies, think of
> overcommit as "ethernet" and precommit as "token ring".

You mean, that while technically superior, precommit suffers from
a topological problem and the fact that a very expensive concentrator
is needed?! ;-)

Token Ring still lives in the spirit, though it's called FDDI
nowadays...


/David
_ _
// David Weinehall <[email protected]> /> Northern lights wander \\
// Maintainer of the v2.0 kernel // Dance across the winter sky //
\> http://www.acc.umu.se/~tao/ </ Full colour fire </

2002-01-19 13:39:30

by Christoph Rohland

Subject: Re: clarification about redhat and vm

Hi Andrea,

On Fri, 18 Jan 2002, Andrea Arcangeli wrote:
> and I assume you were using either ext2 or reiserfs anyways, so the
> fsync problem never affected you since the first place (also with
> older kernels) I believe.

It was done on ext2 _and_ against raw devices. Same tendency on both
setups.

Furthermore, I doubt the test is very dependent on fsync. It should be
swap io limited since it runs with a way too small memory
configuration.

If you have enough memory the test is not very IO intensive either
despite the fact that a big database is running. To bring the database
really into IO you have to add application servers. (Fujitsu Siemens
took 160 4way Linux servers to saturate a database server running
Solaris on 64way FSC Primepower.)

BTW since we are just bashing VMs: I always hear that 2.2 is so much
better: The first 2.2 kernel which could really survive this test was
2.2.19!

Greetings
Christoph



2002-01-19 13:42:31

by Alan

Subject: Re: clarification about redhat and vm

> BTW since we are just bashing VMs: I always hear that 2.2 is so much
> better: The first 2.2 kernel which could really survive this test was
> 2.2.19!

That I can believe. With the exception of the dcache balancing problem the
2.2.19/20 VM basically eliminated all 2.2 bug reports on VM behaviour.

Alan

2002-01-19 17:39:05

by Andrea Arcangeli

Subject: Re: clarification about redhat and vm

On Sat, Jan 19, 2002 at 01:54:04PM +0000, Alan Cox wrote:
> > BTW since we are just bashing VMs: I always hear that 2.2 is so much
> > better: The first 2.2 kernel which could really survive this test was
> > 2.2.19!
>
> That I can believe. With the exception of the dcache balancing problem the
> 2.2.19/20 VM basically eliminated all 2.2 bug reports on VM behaviour.

can you reproduce the dcache problem with this patch applied?

ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.2/2.2.20aa1/00_inode-boot-dynamic-3

The growth of the dcache in 2.2 is bound to the growth of the icache, which
is not dynamic in 2.2 (modulo hardlinks, but I don't think the guy was
using hardlinks).

Andrea

2002-01-20 02:03:00

by Rob Landley

Subject: Re: vm philosophising

On Friday 18 January 2002 11:42 pm, David Luyer wrote:

> And while precommit may be something people ask for, I'd have to say
> many
> of them would, having experienced the difference on identical hardware,
> then realise what a bad idea it was and go back to the current mode.
> That is, it sounds like a big waste of time to implement the
> 'traditional'
> behaviour which Linux is already so much better than.
>
> David.

Precommit basically just asks the application to die when it first allocates
memory if it's even possible for it to die in the most pathological usage case
of that memory.

I.E. "die up front" instead of "die while running". (There's no possible way
this can improve performance. If you want to never swap, just don't mount a
swap partition.) You don't even have to change the VM's behavior, it can
still copy on write and such. You just add a test to cause allocations to
fail unnecessarily at times.

Throwing in a little extra code on mmaps and allocations to kill a process
wouldn't be too hard. It would be stupid outside of something like a
financial transaction system (and probably even in there), but it technically
shouldn't be all that hard to do.

Unless I missed something...?

Rob

2002-01-20 03:14:59

by Bart Trojanowski

Subject: Re: C source lines for assembly listing

* Kallol Biswas <[email protected]> [020116 20:35]:
> Hi,
> Does gcc have an option to list the C source line information for
> assembly instructions?

I am not sure what you are asking for... but I will give it a shot. ;)

One of the tools that comes with the package binutils is called objdump.

If you compile your source with -g flag then you can use objdump to
display mixed assembly and C source code.

gcc -g -c foo.c -o foo.o
objdump -S foo.o
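
A small file to try this on, as a sketch: the file name foo.c matches
the commands above, but the function sum_to is made up for the example.
Because it has no main(), compile it to an object file with -c (for
example, gcc -g -c foo.c -o foo.o) before running objdump -S on the
result.

```c
/* foo.c: hypothetical input file for the gcc/objdump commands */

/* Return 1 + 2 + ... + n (0 for n <= 0). */
int sum_to(int n)
{
    int total = 0;
    int i;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

In the objdump -S output, each C line of the loop should appear
interleaved with the instructions generated for it.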

I hope this helps.

B.

--
WebSig: http://www.jukie.net/~bart/sig/



2002-01-20 05:50:38

by Stephen Oberholtzer

Subject: Re: vm philosophising

Why don't we all follow the MSCommit method of VM? We simply allocate 99%
of physical RAM for cache and other non-userspace purposes, and whenever an
application needs memory, pop up a message:
printk("Your system is out of virtual memory. Linux is increasing "
       "your virtual memory size. During this time, memory allocation "
       "requests may fail.\n");

Then spend a few minutes doing hard disk I/O, while exposing bugs in
programs that don't check to make sure that malloc succeeded.


;)



--
Stevie-O

Real programmers use COPY CON PROGRAM.EXE

2002-01-21 15:51:07

by The Doctor What

Subject: Re: vm philosophising

* Tommy Faasen ([email protected]) [020118 08:47]:
> 2-DBMS: 1 or 2 big programs which sometimes even do their own
> memory management. Fragmentation and latency aren't an issue here I
> think; however, moving lots of data to and from swap is.

A lot of times a DBMS is built that way because its designers assume they
know better than the OS designer how memory should be managed. Same
reason they usually use raw writing to the drive instead of using
the OS calls.

Is this right or fair? I don't know. But it does imply that if a
VM or FS layer for an OS performs well enough, a DBMS might be built
to take advantage of the OS's VM and FS layer.

Ciao!

--
"When you have to shoot, shoot! Don't talk."
--Tuco (The Good, The Bad, and The Ugly)

The Doctor What: Kaboom! http://docwhat.gerf.org/
[email protected] KF6VNC

2002-01-21 16:16:50

by Mike Harrold

Subject: Re: vm philosophising

>
> * Tommy Faasen ([email protected]) [020118 08:47]:
> > 2-DBMS: 1 or 2 big programs which sometimes even do their own
> > memory management. Fragmentation and latency aren't an issue here I
> > think; however, moving lots of data to and from swap is.
>
> A lot of times a DBMS is built that way because its designers assume they
> know better than the OS designer how memory should be managed. Same
> reason they usually use raw writing to the drive instead of using
> the OS calls.

Actually this isn't true. A DBMS usually handles its own memory because
everything is done in blocks of the same size. Since the block size is
configured as part of the DBMS's parameters, the DBMS is much better at
handling this than the OS ever could be, once it has obtained the original
memory from the OS. Remember, this space is normally shared memory as well.
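
A minimal sketch of the fixed-size block idea, under stated
assumptions: the names are hypothetical, malloc() stands in for the
shared memory segment a real DBMS would obtain from the OS, and there
is no locking or overflow checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size block pool: one big allocation up front,
 * then O(1) get/put through a free list threaded through the blocks. */
struct block_pool {
    void  *arena;      /* memory obtained once from the OS */
    void  *free_list;  /* singly linked list of free blocks */
    size_t block_size; /* configured once, same for every block */
};

int pool_init(struct block_pool *p, size_t block_size, size_t nblocks)
{
    size_t i;

    /* Each free block stores a pointer, so it must be able to hold one. */
    assert(block_size >= sizeof(void *) && block_size % sizeof(void *) == 0);
    p->arena = malloc(block_size * nblocks);
    if (!p->arena)
        return -1;
    p->block_size = block_size;
    p->free_list = NULL;
    for (i = 0; i < nblocks; i++) {
        void *blk = (char *)p->arena + i * block_size;
        *(void **)blk = p->free_list;
        p->free_list = blk;
    }
    return 0;
}

/* Take one block, or NULL if the pool is exhausted. */
void *pool_get(struct block_pool *p)
{
    void *blk = p->free_list;

    if (blk)
        p->free_list = *(void **)blk;
    return blk;
}

/* Return a block previously handed out by pool_get(). */
void pool_put(struct block_pool *p, void *blk)
{
    *(void **)blk = p->free_list;
    p->free_list = blk;
}

void pool_destroy(struct block_pool *p)
{
    free(p->arena);
    p->arena = NULL;
    p->free_list = NULL;
}
```

Because every block has the same size, the pool never fragments and
never needs to search for a fit, which is the advantage a general
purpose allocator cannot assume.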

As for the FS, DBMSes prefer raw devices for consistency reasons (as
well as speed). Raw devices bypass the kernel's internal
buffers for files (thus reducing the number of memory copies involved).

/Mike

2002-01-21 17:57:04

by Bill Davidsen

Subject: Re: vm philosophising

On Thu, 17 Jan 2002, Matthew Johnson wrote:

> How does one test the VM precisely? Sorry for the ignorance on this subject.

Having been active doing just that, I have been triggering my favorite bad
behaviour and trying to evaluate which VM (if either) makes the system run
better, as defined by both measurements and "feel."

The two problems I have been seeing are (1) load with low to moderate
memory, and (2) sudden i/o bursts freezing the system when doing large
writes (CD image creation).

My test for #1 is simple, I compile the 2.4.16 stock kernel after booting
with mem=64m or mem=128m options. I have a batch of files to hack into
something I can post, and I ran on an Athlon 1400 and dual Celeron 500
system, so I have moderate UP and SMP machines of similar performance when
full memory is used. I compile with:
make clean; make dep
make bzImage modules MAKE='make -j7'

I took all these numbers with the intent of posting, but the runs finished
at 0630 this morning and I haven't the time yet. Gut feeling is that
17-rmap-11c works better under small memory and 18pre2aa2 was better when
creating CD images on the Athlon; I ran out of time for the SMP.

Neither crashed, hung, or caused the OOM to commit procedural genocide,
which plain 2.4.17 does. The -aa kernel was also tested with my own patch
for intermediate disk loads, I will post when I'm sure it's actually
better by enough to matter. I believe the extra tuning in -aa allows
better large i/o performance if you match bdflush to your load.

Check large i/o by sync() followed by a fast CD build from wav files, on
fast disk. When the disk light comes on hard, grab a window and try to wave
it around, or change X virtual desktops. Chances are that you will get bad
to nil response if you have fast disk and CPU. I get a whole 600MB in
memory before the light comes on :-(

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.