2002-03-13 13:17:09

by Peter Zaitsev

Subject: MMAP vs READ/WRITE

Hello linux-kernel,

I wrote to you about the possibility of using mmap for I/O.
So I decided to check whether it really gives any benefit under Linux. I
used a stock system (IDE PIII-500, 768M RAM) running 2.4.19pre1aa1.

I tested two file sizes, 200M and 2000M, to see how much it matters
whether the content fits in cache. I used 1K blocks, the same size
MYISAM uses for its key cache.
There were two tests. One read the file content sequentially; the
other did random I/O, reading and writing 1K blocks at random
locations. Reads and writes were split 50/50. The number of
random I/Os was 10000. The file system was ext3.
Before each test the cache was brought to a stable state with
"cat test.dat > /dev/null".
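
For readers without the test source, here is a minimal sketch of what the sequential half of such a comparison might look like. It is not the program that produced the numbers in this thread: the function names sum_with_read and sum_with_mmap are made up for illustration, the byte sum only exists to force every page to be touched, and mapping the whole file in one mmap() call is an assumption about how the mmap case was done.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sum every byte of the file using read() in `block`-byte chunks.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int sum_with_read(const char *path, size_t block, uint64_t *sum)
{
    assert(block > 0);
    unsigned char *buf = malloc(block);
    if (buf == NULL)
        return -1;
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }
    uint64_t total = 0;
    ssize_t n;
    while ((n = read(fd, buf, block)) > 0)
        for (ssize_t i = 0; i < n; i++)
            total += buf[i];
    close(fd);
    free(buf);
    if (n < 0)
        return -1;
    *sum = total;
    return 0;
}

/* Sum every byte of the file through one read-only shared mapping.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int sum_with_mmap(const char *path, uint64_t *sum)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    uint64_t total = 0;
    if (st.st_size > 0) {   /* mmap() rejects a zero-length mapping */
        size_t len = (size_t)st.st_size;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        for (size_t i = 0; i < len; i++)
            total += p[i];   /* each first touch of a page may fault it in */
        munmap(p, len);
    }
    close(fd);
    *sum = total;
    return 0;
}
```

Calling sum_with_read("test.dat", 1024, &s) and sum_with_mmap("test.dat", &s) under time(1) would give figures comparable in kind to the tables below; the two sums should always agree.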


2000M
Test          Time Best    Time Worst   User time   System time
Seq. read()   1m28.706s    2m20.358s    0m0.920s    0m26.550s
Seq. mmap()   2m4.718s     4m41.674s    0m8.100s    0m11.040s
Rnd read()    1m49.413s    2m46.212s    0m0.010s    0m0.470s
Rnd mmap()    1m17.349s    2m13.482s    0m0.070s    0m0.430s


200M
Test          Time Best    Time Worst   User time   System time
Seq. read()   0m1.465s     0m1.481s     0m0.050s    0m1.290s
Seq. mmap()   0m1.206s     0m1.518s     0m0.980s    0m0.130s
Rnd read()    0m0.130s     0m0.134s     0m0.020s    0m0.100s
Rnd mmap()    0m0.079s     0m0.082s     0m0.050s    0m0.020s



2000M 4K block (to compare)
Test          Time Best    Time Worst   User time   System time
Seq. read()   1m28.328s    2m11.609s    0m0.260s    0m23.510s
Seq. mmap()   2m32.768s    4m28.321s    0m8.090s    0m11.240s
Rnd read()    1m6.351s     1m46.149s    0m0.040s    0m0.400s
Rnd mmap()    1m10.707s    1m57.281s    0m0.280s    0m0.510s




200M 32 byte block (to compare)
Test          Time Best    Time Worst   User time   System time
Seq. read()   0m8.076s     0m9.404s     0m1.730s    0m6.620s
Seq. mmap()   0m1.227s     0m1.237s     0m1.140s    0m0.080s
Rnd read()    0m0.074s     0m0.085s     0m0.010s    0m0.070s
Rnd mmap()    0m0.029s     0m0.030s     0m0.010s    0m0.020s



So I would say mmap is not really optimized nowadays in Linux and so
read() may be winning in cases it should not. Maybe read-ahead is
used with read and is not used with mmap.



P.S. If you're interested, I can send you the complete source.



--
Best regards,
Peter mailto:[email protected]


2002-03-13 13:42:23

by Oleg Drokin

Subject: Re: MMAP vs READ/WRITE

Hello!

On Wed, Mar 13, 2002 at 04:17:18PM +0300, Peter Zaitsev wrote:
> So I would say mmap is not really optimized nowadays in Linux and so
> read() may be winning in cases it should not. Maybe read-ahead is
> used with read and is not used with mmap.

How about reading the manual page for madvise(2) and redoing your test?

Also, the cache is best cleared by unmounting the filesystem in question
and then mounting it again.
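
As an illustration of the madvise(2) suggestion, the sketch below maps a file and tells the kernel the mapping will be read sequentially. The helper name map_sequential is hypothetical, and the choice of MADV_SEQUENTIAL (rather than, say, MADV_WILLNEED) is an assumption about which hint was meant. The hint is advisory only, so how much it helps depends on the kernel.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only and hint that it will be read front to back.
 * On success returns the mapping and stores its length in *len_out.
 * Returns NULL on failure, including for an empty file.
 * (Hypothetical helper, for illustration only.) */
void *map_sequential(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);              /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    /* Advisory hint: expect sequential access, so read-ahead may be
     * more aggressive. A failure here is not treated as fatal. */
    (void)madvise(p, len, MADV_SEQUENTIAL);

    *len_out = len;
    return p;
}
```

The caller reads through the returned pointer as usual and releases it with munmap(p, len).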

Bye,
Oleg

2002-03-13 13:58:54

by Rik van Riel

Subject: Re: MMAP vs READ/WRITE

On Wed, 13 Mar 2002, Peter Zaitsev wrote:

> So I would say mmap is not really optimized nowadays in Linux and so
> read() may be winning in cases it should not. Maybe read-ahead is
> used with read and is not used with mmap.

Both guesses are correct.

Rik
--
<insert bitkeeper endorsement here>

http://www.surriel.com/ http://distro.conectiva.com/

2002-03-13 14:48:15

by Peter Zaitsev

Subject: Re[2]: MMAP vs READ/WRITE

Hello Rik,

Wednesday, March 13, 2002, 4:58:20 PM, you wrote:

Are you saying that with the rmap patches the situation should be
different?

RvR> On Wed, 13 Mar 2002, Peter Zaitsev wrote:

>> So I would say mmap is not really optimized nowadays in Linux and so
>> read() may be winning in cases it should not. Maybe read-ahead is
>> used with read and is not used with mmap.

RvR> Both guesses are correct.

RvR> Rik



--
Best regards,
Peter mailto:[email protected]

2002-03-13 14:58:48

by Rik van Riel

Subject: Re[2]: MMAP vs READ/WRITE

On Wed, 13 Mar 2002, Peter Zaitsev wrote:

> RvR> On Wed, 13 Mar 2002, Peter Zaitsev wrote:
>
> >> So I would say mmap is not really optimized nowadays in Linux and so
> >> read() may be winning in cases it should not. Maybe read-ahead is
> >> used with read and is not used with mmap.
>
> RvR> Both guesses are correct.
>
> Are you saying that with the rmap patches the situation should be
> different?

That would be a bit premature since this part of the code
hasn't been touched by -rmap ;)

It is something that still needs fixing, though.

regards,

Rik
--
<insert bitkeeper endorsement here>

http://www.surriel.com/ http://distro.conectiva.com/

2002-03-13 15:17:31

by Peter Zaitsev

Subject: Re[3]: MMAP vs READ/WRITE

Hello Rik,

Wednesday, March 13, 2002, 5:58:08 PM, you wrote:


RvR> That would be a bit premature since this part of the code
RvR> hasn't been touched by -rmap ;)

RvR> It is something that still needs fixing, though.

:)

The most upsetting thing is the following:

0 2 0 210736 19472 2000 733188 216 0 5068 2 382 340 4 2 94
0 2 0 210424 20108 1860 729596 219 0 4732 0 319 352 4 1 96
0 2 0 210216 19652 1616 727280 254 0 4718 10 313 298 3 4 93
1 1 0 209756 19988 1744 723940 285 0 4523 14 313 197 6 6 88
0 2 0 209700 20096 1904 722236 223 0 4485 15 307 265 7 5 88


Note that some pages are coming in from swap (the eighth column, swap-in).

This is the vmstat output captured while reading via mmap. So it
looks like the current VM in some cases prefers to swap out pages from
mapped files rather than discarding them :(

This of course reduces performance on sequential I/O :(



--
Best regards,
Peter mailto:[email protected]

2002-03-14 09:27:00

by Peter Zaitsev

Subject: Re[2]: MMAP vs READ/WRITE

Hello Oleg,

Wednesday, March 13, 2002, 4:41:58 PM, you wrote:

OD> Hello!

OD> On Wed, Mar 13, 2002 at 04:17:18PM +0300, Peter Zaitsev wrote:
>> So I would say mmap is not really optimized nowadays in Linux and so
>> read() may be winning in cases it should not. Maybe read-ahead is
>> used with read and is not used with mmap.

OD> How about reading the manual page for madvise(2) and redoing your test?

OK, I did, but no luck. The results are much the same.

I think the biggest problem is:

0 2 0 210736 19472 2000 733188 216 0 5068 2 382 340 4 2 94
0 2 0 210424 20108 1860 729596 219 0 4732 0 319 352 4 1 96
0 2 0 210216 19652 1616 727280 254 0 4718 10 313 298 3 4 93
1 1 0 209756 19988 1744 723940 285 0 4523 14 313 197 6 6 88
0 2 0 209700 20096 1904 722236 223 0 4485 15 307 265 7 5 88

So when the file is memory mapped and read from, some pages are coming
in from swap instead of being read from the file....


OD> Also, the cache is best cleared by unmounting the filesystem in question
OD> and then mounting it again.

Well, this was not really needed, as I repeated the test several times
in a loop, without clearing the cache after the initial cleaning, to see
how stable the results are.
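
To illustrate what "repeating the test in a loop" can look like, here is a small sketch that runs a workload several times and keeps the fastest and slowest wall-clock time, in the spirit of the "Time Best" and "Time Worst" columns. It is an assumption about the method, not the actual harness; the names run_stats, time_runs and count_call are invented for the example, and count_call is only a stand-in for a real read() or mmap() pass.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Fastest and slowest wall-clock time, in seconds, over a set of runs. */
struct run_stats {
    double best;
    double worst;
};

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Call fn(arg) `runs` times without touching the cache in between and
 * record the best and worst elapsed time. (Hypothetical helper.) */
struct run_stats time_runs(void (*fn)(void *), void *arg, int runs)
{
    assert(runs > 0);
    struct run_stats s = { 0.0, 0.0 };
    for (int i = 0; i < runs; i++) {
        double t0 = now_seconds();
        fn(arg);
        double dt = now_seconds() - t0;
        if (i == 0 || dt < s.best)
            s.best = dt;
        if (i == 0 || dt > s.worst)
            s.worst = dt;
    }
    return s;
}

/* Placeholder workload: just counts how often it was called.
 * A real test would do the sequential or random I/O pass here. */
void count_call(void *arg)
{
    ++*(int *)arg;
}
```

A large gap between best and worst, as in the 2000M tables, suggests the cache state is still changing from run to run.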




--
Best regards,
Peter mailto:[email protected]