2004-01-07 21:53:56

by Nathan Scott

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

On Wed, Jan 07, 2004 at 02:30:42AM -0800, Andrew Morton wrote:
>
> Nathan, did anything change in XFS which might explain this?

Just been back through the revision history, and that's a definite
"no" - very little changed in 2.6 XFS while the big 2.6 freeze was
on, and since then too (he says, busily preparing a merge tree).

> I see XFS has some private readahead code. Anything change there?

No, and that readahead code is used for metadata only - for file data
we're making use of the generic IO path code.

cheers.

> Date: Wed, 7 Jan 2004 10:25:21 +0100
> From: Jan Kasprzak <[email protected]>
> To: [email protected]
> Subject: Performance drop 2.6.0-test7 -> 2.6.1-rc2
>
>
> Hello, kernel hackers!
>
> Yesterday I upgraded the kernel on my FTP server to 2.6.1-rc2
> (from 2.6.0-test7), and today I found the server struggling under a
> load average of 40+; the system was barely usable. The main load on the
> server is ProFTPd serving mainly ISO 9660 images using sendfile().
>
> Under 2.6.1-rc2 the server put out about 30 Mbit/s of data and the load
> average was >40. According to top(1) under 2.6.1-rc2, the CPU was about
> 30% system and 70% iowait. Killing the proftpd processes of course helped
> and the load went back to normal. After rebooting back to 2.6.0-test7 the
> load was 1-3 and the bandwidth >50 Mbit/s (at which point it is probably
> limited by the bandwidth of the FTP clients). During past Linux
> distribution releases the same hardware was able to serve ~400 FTP
> clients and >220 Mbit/s of bandwidth.
>
> The system is an Athlon 850 with 1.2GB RAM (CONFIG_HIGHMEM4G) and
> seven IDE disks, six of them joined into an 800GB LVM1 volume, which holds
> an XFS filesystem with the anonymous FTP tree.
>
> I think I can see one possible explanation for this behaviour:
> I have written a simple script to parse /proc/diskstats, and according
> to its output it seems that the block layer in 2.6.1-rc2 issues read
> operations several times larger than 2.6.0-test7 does. The latter has
> requests between 64 and 128 KB (and never bigger than 128 KB), while the
> former can apparently issue requests as big as 1 MByte. Anyway, the output
> of my script (5-second averages) looks like this under 2.6.1-rc2:
>
> disk rops read rsize rload wops write wsize wload pend util ioload
> name [1/s] [KB/s] [KB] [req] [1/s] [KB/s] [KB] [req] [1] [%] [req]
> hda 11.6 50 4.3 0.3 35.6 550 15.4 0.4 0 38.0 0.7
> hde 104.6 6562 62.7 24.1 0.0 0 0.0 0.0 0 21.4 24.1
> hdh 25.0 100 4.0 0.0 0.0 0 0.0 0.0 0 1.4 0.0
> hdi 62.4 7798 125.0 12.0 0.0 0 0.0 0.0 38 95.2 9.9
> hdj 20.0 15718 785.9 133.4 0.0 0 0.0 0.0 148 100.2 132.2
> hdk 93.0 11807 127.0 67.2 0.0 0 0.0 0.0 23 100.2 42.6
> hdl 58.2 9050 155.5 16.5 0.0 0 0.0 0.0 34 95.2 11.8
>
> disk rops read rsize rload wops write wsize wload pend util ioload
> name [1/s] [KB/s] [KB] [req] [1/s] [KB/s] [KB] [req] [1] [%] [req]
> hda 47.0 1303 27.7 1.9 13.6 170 12.5 2.5 0 71.5 4.3
> hde 68.8 4362 63.4 17.4 0.0 0 0.0 0.0 0 16.2 17.4
> hdh 283.0 16861 59.6 92.3 0.0 0 0.0 0.0 0 71.2 92.3
> hdi 49.6 9586 193.3 17.1 0.0 0 0.0 0.0 26 89.0 18.6
> hdj 27.0 9248 342.5 137.2 0.0 1 0.0 0.0 126 100.3 133.4
> hdk 298.2 11202 37.6 14.4 0.0 0 0.0 0.0 34 93.3 17.8
> hdl 87.8 9549 108.8 17.0 0.0 0 0.0 0.0 31 85.4 19.8
>
> and the following is from 2.6.0-test7:
>
> disk rops read rsize rload wops write wsize wload pend util ioload
> name [1/s] [KB/s] [KB] [req] [1/s] [KB/s] [KB] [req] [1] [%] [req]
> hda 4.6 21 4.5 0.0 19.2 252 13.1 0.2 1 15.5 0.2
> hde 42.6 2660 62.4 0.2 0.0 0 0.0 0.0 0 10.9 0.2
> hdh 12.8 601 46.9 0.1 0.0 0 0.0 0.0 0 5.8 0.1
> hdi 40.2 3992 99.3 0.3 0.0 0 0.0 0.0 0 25.0 0.3
> hdj 30.8 2990 97.1 0.4 0.0 0 0.0 0.0 1 31.3 0.4
> hdk 42.8 4843 113.2 0.3 0.0 0 0.0 0.0 0 24.3 0.3
> hdl 81.8 9879 120.8 0.5 0.0 0 0.0 0.0 0 39.2 0.5
>
> disk rops read rsize rload wops write wsize wload pend util ioload
> name [1/s] [KB/s] [KB] [req] [1/s] [KB/s] [KB] [req] [1] [%] [req]
> hda 8.4 34 4.1 0.2 57.2 574 10.0 4.2 0 41.3 4.4
> hde 26.0 1461 56.2 0.2 0.0 0 0.0 0.0 0 8.1 0.2
> hdh 54.4 3272 60.1 0.3 0.0 0 0.0 0.0 0 14.0 0.3
> hdi 35.4 3590 101.4 0.3 0.6 6 9.3 0.0 0 29.0 0.3
> hdj 61.8 5895 95.4 1.0 0.2 2 8.0 0.0 2 57.4 1.1
> hdk 44.2 5042 114.1 0.2 0.4 2 6.0 0.0 0 21.2 0.2
> hdl 13.6 1136 83.5 0.1 0.0 0 0.0 0.0 0 11.3 0.1
>
> note the "rsize" column.
>
> But of course the reason could be something else entirely (changes in
> XFS, maybe?).
>
> I am willing to do more testing, but since this is a production
> server, I cannot go through each kernel version between -test7 and 2.6.1-rc2.
>
> Can you explain (or better, suggest a fix for) this behaviour?
> Thanks,
>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
> | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
> | http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
> | I actually have a lot of admiration and respect for the PATA knowledge |
> | embedded in drivers/ide. But I would never call it pretty:) -Jeff Garzik |

--
Nathan


2004-01-08 09:55:29

by Jan Kasprzak

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Nathan Scott wrote:
: On Wed, Jan 07, 2004 at 02:30:42AM -0800, Andrew Morton wrote:
: >
: > Nathan, did anything change in XFS which might explain this?
:
: Just been back through the revision history, and that's a definite
: "no" - very little changed in 2.6 XFS while the big 2.6 freeze was
: on, and since then too (he says, busily preparing a merge tree).
:
: > I see XFS has some private readahead code. Anything change there?
:
: No, and that readahead code is used for metadata only - for file data
: we're making use of the generic IO path code.
:
I have done further testing:

- this is reliable: repeated boot back to 2.6.1-rc2 makes the problem
appear again (high load, system slow as hell), booting back
to -test7 makes it disappear.
- this is probably not a compiler/toolchain issue (I tried compiling the
  kernel with two different gcc versions and two different system
  environments: the Red Hat 9 one and the latest Gentoo).
- I have seen something similar on my fileserver (although I was not able
  to research it thoroughly, because I had to go back to 2.4).
  The fileserver is a dual Athlon MP with 1GB RAM and 2TB of storage on
  a RAID-5 array on a 3ware controller. Most of its load is serving
  home directories via NFS (about 2200 users), and a few Windows clients
  via Samba 3.0.1. The filesystem is XFS as well, but I don't know whether
  that matters. I have tried 2.6.0 only, not earlier or
  later revisions. 2.4.24-pre (XFS) works OK here.

If I had to guess, I would say it is a problem in the block layer
or in I/O request scheduling rather than in XFS.

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
| I actually have a lot of admiration and respect for the PATA knowledge |
| embedded in drivers/ide. But I would never call it pretty:) -Jeff Garzik |

2004-01-08 10:17:17

by Andrew Morton

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Jan Kasprzak <[email protected]> wrote:
>
> Nathan Scott wrote:
> : On Wed, Jan 07, 2004 at 02:30:42AM -0800, Andrew Morton wrote:
> : >
> : > Nathan, did anything change in XFS which might explain this?
> :
> : Just been back through the revision history, and that's a definite
> : "no" - very little changed in 2.6 XFS while the big 2.6 freeze was
> : on, and since then too (he says, busily preparing a merge tree).
> :
> : > I see XFS has some private readahead code. Anything change there?
> :
> : No, and that readahead code is used for metadata only - for file data
> : we're making use of the generic IO path code.
> :
> I have done further testing:
>
> - this is reliable: repeated boot back to 2.6.1-rc2 makes the problem
> appear again (high load, system slow as hell), booting back
> to -test7 makes it disappear.

Is the CPU load higher than normal? Excluding I/O wait? If so, can you
profile the kernel while the load is running?

- boot with `profile=1' on the kernel command line

- establish steady state load

- Run readprofile -r

- sleep 120

- readprofile -n -v -m System.map | sort -n +2 | tail -40

(More info in Documentation/basic_profiling.txt)
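
In other words, as one sequence (only a sketch; it assumes the kernel was
booted with profile=1 and that /boot/System.map matches the running
kernel):

    #!/bin/sh
    # Reset the profiling counters, let the steady-state load run for
    # two minutes, then show the 40 hottest kernel symbols.
    MAP=/boot/System.map           # must match the running kernel

    readprofile -r
    sleep 120
    # "sort -n -k3" is the portable spelling of "sort -n +2": sort
    # numerically on the third column, the tick count.
    readprofile -n -v -m "$MAP" | sort -n -k3 | tail -40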

Thanks.

2004-01-08 10:26:41

by Jan Kasprzak

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Andrew Morton wrote:
: Jan Kasprzak <[email protected]> wrote:
: > - this is reliable: repeated boot back to 2.6.1-rc2 makes the problem
: > appear again (high load, system slow as hell), booting back
: > to -test7 makes it disappear.
:
: Is the CPU load higher than normal? Excluding I/O wait?

No, ~30% system is pretty standard for this server. I have looked
just now (under 2.6.0-test7), and I have 33% system, about 50% nice,
and the rest is user, iowait and idle. Under 2.6.1-rc2 it was about 30%
system and the rest iowait, with a small amount of nice and user.
However, the load may be different. It is hard to have any kind of
"fixed" load when you serve data over FTP, HTTP, rsync and do some
other minor tasks (updatedb, current/up2date server, ...).

Do you still want the system profiling info?

-Y.

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
| I actually have a lot of admiration and respect for the PATA knowledge |
| embedded in drivers/ide. But I would never call it pretty:) -Jeff Garzik |

2004-01-08 10:34:26

by Andrew Morton

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Jan Kasprzak <[email protected]> wrote:
>
> Andrew Morton wrote:
> : Jan Kasprzak <[email protected]> wrote:
> : > - this is reliable: repeated boot back to 2.6.1-rc2 makes the problem
> : > appear again (high load, system slow as hell), booting back
> : > to -test7 makes it disappear.
> :
> : Is the CPU load higher than normal? Excluding I/O wait?
>
> No, ~30% system is pretty standard for this server. I have looked
> just now (under 2.6.0-test7), and I have 33% system, about 50% nice,
> and the rest is user, iowait and idle. Under 2.6.1-rc2 it was about 30%
> system and the rest iowait, with a small amount of nice and user.
> However, the load may be different. It is hard to have any kind of
> "fixed" load when you serve data over FTP, HTTP, rsync and do some
> other minor tasks (updatedb, current/up2date server, ...).

OK.

> Do you still want the system profiling info?

Nope.

It would be interesting to run some simple benchmarks (dbench, iozone,
tiobench, etc) on a relatively idle system.

After that, it'd be a matter of searching through kernel versions, which
you presumably cannot do. Or eliminating device mapper from the picture,
which is also presumably not an option.


Have you run `hdparm' to check that all those disks have DMA enabled? I
guess they have.
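
A quick way to check (a sketch; the device list just mirrors the drives
in your report, adjust as needed):

    # Print the using_dma flag for each drive; "using_dma = 1 (on)"
    # means DMA is enabled for that device.
    for dev in /dev/hda /dev/hde /dev/hdh /dev/hdi /dev/hdj /dev/hdk /dev/hdl
    do
        hdparm -d "$dev"
    done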


2004-01-08 11:02:37

by Jan Kasprzak

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Andrew Morton wrote:
:
: It would be interesting to run some simple benchmarks (dbench, iozone,
: tiobench, etc) on a relatively idle system.
:
I will try to, but it is hard to get a "relatively idle" system.

: After that, it'd be a matter of searching through kernel versions, which
: you presumably cannot do. Or eliminating device mapper from the picture,
: which is also presumably not an option.
:
I'll try to step through the kernel versions, but I cannot reboot this
machine too often ...
:
: Have you run `hdparm' to check that all those disks have DMA enabled? I
: guess they have.
:
No, I did not run hdparm, but the kernel messages from
2.6.1-rc2 were:

Jan 6 17:05:06 odysseus kernel: Linux version 2.6.1-rc2 (root@atlas) (gcc version 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r3, propolice)) #1 Tue Jan 6 16:47:51 CET 2004
[...]
Jan 6 17:05:09 odysseus kernel: hda: 40011300 sectors (20485 MB) w/512KiB Cache, CHS=39693/16/63, UDMA(66)
Jan 6 17:05:09 odysseus kernel: hda: hda1 hda2 hda3 hda4 < hda5 hda6 >
Jan 6 17:05:09 odysseus kernel: hde: max request size: 64KiB
Jan 6 17:05:09 odysseus kernel: hde: 150136560 sectors (76869 MB) w/1916KiB Cache, CHS=65535/16/63, UDMA(100)
Jan 6 17:05:09 odysseus kernel: hde: hde1 hde2 hde3
Jan 6 17:05:09 odysseus kernel: hdh: max request size: 64KiB
Jan 6 17:05:09 odysseus kernel: hdh: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
Jan 6 17:05:09 odysseus kernel: hdh: hdh1 hdh2 hdh3
Jan 6 17:05:09 odysseus kernel: hdi: max request size: 1024KiB
Jan 6 17:05:09 odysseus kernel: hdi: 361882080 sectors (185283 MB) w/7965KiB Cache, CHS=22526/255/63, UDMA(100)
Jan 6 17:05:09 odysseus kernel: hdi: hdi1 hdi2 hdi3
Jan 6 17:05:09 odysseus kernel: hdj: max request size: 1024KiB
Jan 6 17:05:09 odysseus kernel: hdj: 361882080 sectors (185283 MB) w/7965KiB Cache, CHS=22526/255/63, UDMA(100)
Jan 6 17:05:09 odysseus kernel: hdj: hdj1 hdj2 hdj3
Jan 6 17:05:09 odysseus kernel: hdk: max request size: 1024KiB
Jan 6 17:05:09 odysseus kernel: hdk: 361882080 sectors (185283 MB) w/7965KiB Cache, CHS=22526/255/63, UDMA(100)
Jan 6 17:05:09 odysseus kernel: hdk: hdk1 hdk2 hdk3
Jan 6 17:05:09 odysseus kernel: hdl: max request size: 1024KiB
Jan 6 17:05:09 odysseus kernel: hdl: 361882080 sectors (185283 MB) w/7965KiB Cache, CHS=22526/255/63, UDMA(100)
Jan 6 17:05:09 odysseus kernel: hdl: hdl1 hdl2 hdl3

So judging from the "UDMA(100)" strings I think DMA was
enabled then.

Do you think it could be something to do with the read request size?
I have never seen a read request >128KB on 2.6.0-test7 (according to
the average values computed from /proc/diskstats), but under 2.6.1-rc2 I
often see bigger requests.
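
For reference, the "rsize" figure is just sectors read divided by read
requests completed over an interval; a minimal sketch of that computation
(not the original script) would be:

    #!/bin/sh
    # Average read request size per whole disk over an interval.
    # /proc/diskstats whole-disk lines: field 3 is the device name,
    # field 4 the reads completed, field 6 the sectors read (512 B each).
    INTERVAL=${1:-5}

    snap() {
        awk '$3 ~ /^hd[a-z]$/ { print $3, $4, $6 }' /proc/diskstats | sort
    }

    snap > /tmp/ds.1
    sleep "$INTERVAL"
    snap > /tmp/ds.2

    # Join the two samples on the device name and print requests/s,
    # KB/s and the average KB per read request ("rsize").
    join /tmp/ds.1 /tmp/ds.2 | awk -v t="$INTERVAL" '{
        rops  = $4 - $2;            # read requests completed
        rkb   = ($5 - $3) / 2;      # 512-byte sectors -> KB
        rsize = (rops > 0) ? rkb / rops : 0;
        printf "%-4s %7.1f req/s %9.1f KB/s %7.1f KB\n",
               $1, rops / t, rkb / t, rsize
    }'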

The problem may also be with the IDE controllers. Were there
any significant changes between 2.6.0-test7 and 2.6.1-rc2 in this area?

/dev/hda is on the southbridge:
00:04.1 IDE interface: VIA Technologies, Inc. VT82C586/B/686A/B PIPC Bus Master IDE (rev 10)

/dev/hde and /dev/hdh are on CMD680 card:
00:0a.0 RAID bus controller: CMD Technology Inc PCI0680 (rev 02)

others are on the on-board Promise controller:
00:11.0 Unknown mass storage controller: Promise Technology, Inc. 20265 (rev 02)

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
| I actually have a lot of admiration and respect for the PATA knowledge |
| embedded in drivers/ide. But I would never call it pretty:) -Jeff Garzik |

2004-01-08 12:07:47

by Christoph Hellwig

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

On Thu, Jan 08, 2004 at 10:54:27AM +0100, Jan Kasprzak wrote:
> I have done further testing:
>
> - this is reliable: repeated boot back to 2.6.1-rc2 makes the problem
> appear again (high load, system slow as hell), booting back
> to -test7 makes it disappear.

Can you put fs/xfs from -test7 into the 2.6.1-rc tree and test with that?
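
Something like this, assuming the two unpacked trees sit side by side
(paths are only illustrative):

    # Drop the 2.6.1-rc2 XFS code, copy in the 2.6.0-test7 version,
    # then rebuild; install the new bzImage and reboot as usual.
    cd /usr/src
    rm -rf linux-2.6.1-rc2/fs/xfs
    cp -a linux-2.6.0-test7/fs/xfs linux-2.6.1-rc2/fs/
    cd linux-2.6.1-rc2
    make && make modules_install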

2004-01-08 15:12:23

by Jan Kasprzak

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Jan Kasprzak wrote:
: Now I will try to boot 2.6.0-test9 and we will see if it is
: similar to -rc2 or not.
:
-test9 seems to work OK (the same as -test7). I will try -test11
tomorrow.

-Y.

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
| I actually have a lot of admiration and respect for the PATA knowledge |
| embedded in drivers/ide. But I would never call it pretty:) -Jeff Garzik |

2004-01-08 15:04:20

by Jan Kasprzak

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Christoph Hellwig wrote:
: On Thu, Jan 08, 2004 at 10:54:27AM +0100, Jan Kasprzak wrote:
: > I have done further testing:
: >
: > - this is reliable: repeated boot back to 2.6.1-rc2 makes the problem
: > appear again (high load, system slow as hell), booting back
: > to -test7 makes it disappear.
:
: Can you put fs/xfs from -test7 into the 2.6.1-rc tree and test with that?

I did that. Under 2.6.1-rc2 with the 2.6.0-test7 fs/xfs subtree, the load
went up to >40 two minutes after boot. So it is not XFS-related.

I have also run hdparm -t (under load, though) on all of my
physical disks, and the numbers on 2.6.1-rc2 seem to be more or less
the same as under 2.6.0-test7.

Now I will try to boot 2.6.0-test9 and we will see if it is
similar to -rc2 or not.

-Y.

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Czech Linux Homepage: http://www.linux.cz/ |
| I actually have a lot of admiration and respect for the PATA knowledge |
| embedded in drivers/ide. But I would never call it pretty:) -Jeff Garzik |

2004-01-08 20:26:15

by J. Ryan Earl

Subject: Re: Fw: Performance drop 2.6.0-test7 -> 2.6.1-rc2

Andrew Morton wrote:

>Jan Kasprzak <[email protected]> wrote:
>
>
>>Andrew Morton wrote:
>>: Jan Kasprzak <[email protected]> wrote:
>>: > - this is reliable: repeated boot back to 2.6.1-rc2 makes the problem
>>: > appear again (high load, system slow as hell), booting back
>>: > to -test7 makes it disappear.
>>
>>

I was about to post something about this as well. I recently went to
upgrade a production server in a datacenter. I wanted to use a 2.6
kernel: I tried 2.6.0-mm1 and 2.6.1-rc1-mm1 (I needed the siimage
patches). User-space CPU usage was WAY high, 2x or more
than normal, though sys time was low. Processes that had no connections
and should have used 0% CPU were using 1% CPU even when no one was
connected. And a loaded process that should only use ~25% CPU was
using 60% CPU. Very unresponsive, very slow. I ended up staying with a
2.4.23 kernel with my required patches. I didn't try any other 2.6
kernels, so I can't say which version this problem appeared in, but I
DEFINITELY noticed it and it is repeatable. I didn't have time to test
further.

-ryan