2005-10-28 23:44:45

by Chen, Kenneth W

Subject: kernel performance update - 2.6.14

Kernel performance data for 2.6.14 (released yesterday) is updated at:
http://kernel-perf.sourceforge.net

As expected, results are within run variation compared to 2.6.14-rc5.
No significant deviation was found.
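
Editorial illustration, not part of the original message: one simple way to decide whether a result is "within run variation" is to compare the change in the mean against the run-to-run spread. The sketch below is only an assumption about how such a check could look; the function name, the threshold `k`, and the sample numbers are hypothetical and are not taken from the kernel-perf project.

```python
from statistics import mean, stdev


def within_run_variation(baseline_runs, candidate_runs, k=2.0):
    """Return True if the change in mean is no larger than k times the
    run-to-run standard deviation (the larger of the two sample spreads).

    This is a rough rule of thumb, not a formal significance test.
    """
    noise = max(stdev(baseline_runs), stdev(candidate_runs))
    delta = abs(mean(candidate_runs) - mean(baseline_runs))
    return delta <= k * noise


# Hypothetical throughput samples (arbitrary units) from repeated runs.
rc5_runs = [100, 102, 98, 101]
release_runs = [99, 101, 100, 102]
regressed_runs = [80, 82, 79, 81]

print(within_run_variation(rc5_runs, release_runs))    # small shift, inside the noise
print(within_run_variation(rc5_runs, regressed_runs))  # large drop, outside the noise
```

A check like this needs several runs per kernel; with a single run per kernel there is no spread to compare against.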

Ken Chen
Intel Open Source Technology Center


2005-10-28 23:54:37

by Jeff Garzik

Subject: Re: kernel performance update - 2.6.14

Chen, Kenneth W wrote:
> Kernel performance data for 2.6.14 (released yesterday) is updated at:
> http://kernel-perf.sourceforge.net
>
> As expected, results are within run variation compares to 2.6.14-rc5.
> No significant deviation found compare to 2.6.14-rc5

Do I read this correctly: according to your benchmarks, fileio-noop and
fileio-cfq are down some 20% or more, across all machine configurations,
since 2.6.9? In the 4P configuration, dbench-{noop,as} both seem to have
regressed as well.

Jeff


2005-10-29 00:04:12

by Chen, Kenneth W

Subject: RE: kernel performance update - 2.6.14

Jeff Garzik wrote on Friday, October 28, 2005 4:54 PM
> Chen, Kenneth W wrote:
> > Kernel performance data for 2.6.14 (released yesterday) is updated at:
> > http://kernel-perf.sourceforge.net
> >
> > As expected, results are within run variation compares to 2.6.14-rc5.
> > No significant deviation found compare to 2.6.14-rc5
>
> Do I read this correctly: according to your benchmarks, fileio-noop and
> fileio-cfq are down some 20% or more, across all machine configurations,
> since 2.6.9? In the 4P configuration, dbench-{noop,as} both seem to have
> regressed as well.

Yes, you did read that correctly. For some benchmarks, these numbers
can't be directly interpreted as a regression, since we may not have
the correct configuration (either the setup is wrong or the default
parameters aren't suitable for a machine of that size). dbench is one
such example. We are working on characterizing these workloads to make
sure our setup is meaningful.

We are also looking through the fileio data. I think we understand the
drop in fileio-noop, but I want a more definitive understanding before
claiming a real regression. The same goes for cfq.

- Ken

2005-10-29 00:05:49

by jmerkey

Subject: Re: kernel performance update - 2.6.14


Verified. These numbers reflect my measurements as well. I have not
moved off 2.6.9 to newer kernels on shipping products due to these
issues. There are also serious stability issues, though 2.6.14
seems a little better than previous kernels.

Jeff



Jeff Garzik wrote:

> Chen, Kenneth W wrote:
>
>> Kernel performance data for 2.6.14 (released yesterday) is updated at:
>> http://kernel-perf.sourceforge.net
>>
>> As expected, results are within run variation compares to 2.6.14-rc5.
>> No significant deviation found compare to 2.6.14-rc5
>
>
> Do I read this correctly: according to your benchmarks, fileio-noop
> and fileio-cfq are down some 20% or more, across all machine
> configurations, since 2.6.9? In the 4P configuration, dbench-{noop,as}
> both seem to have regressed as well.
>
> Jeff

2005-10-29 00:19:11

by Felix Oxley

Subject: Re: kernel performance update - 2.6.14

Chen, Kenneth W wrote:
> Kernel performance data for 2.6.14 (released yesterday) is updated at:
> http://kernel-perf.sourceforge.net
>
> As expected, results are within run variation compares to 2.6.14-rc5.
> No significant deviation found compare to 2.6.14-rc5
>

There seems to be some regression here:

System: 4P Xeon
Test:Result Group 8
Metric: 64KB_4_fread
Result: +1.9% -15%
Kernel: 2.6.14-rc4 vs 2.6.14-rc4-git4

System: 2P Xeon
Test:Result Group 7
Metric: ODIRECT
Kernel: 2.6.14-rc5 vs 2.6.14-rc5-git3
Summary: Write has increased, whereas Read has decreased by 4-5%


Any thoughts?

regards,
Felix


2005-10-29 00:29:13

by Felix Oxley

Subject: Re: kernel performance update - 2.6.14

Felix Oxley wrote:
> Chen, Kenneth W wrote:
> > Kernel performance data for 2.6.14 (released yesterday) is updated at:
> > http://kernel-perf.sourceforge.net
> >
> > As expected, results are within run variation compares to 2.6.14-rc5.
> > No significant deviation found compare to 2.6.14-rc5
> >
>
> There seems to be some regression here:
>
> System: 4P Xeon
> Test:Result Group 8
> Metric: 64KB_4_fread
> Result: +1.9% -15%
> Kernel: 2.6.14-rc4 vs 2.6.14-rc4-git4
>
> System: 2P Xeon
> Test:Result Group 7
> Metric: ODIRECT
> Kernel: 2.6.14-rc5 vs 2.6.14-rc5-git3
> Summary: Write has increased whereas Read has decreased by 4-5 %
>
>

Something went horribly wrong with this test between 2.6.13 and
2.6.13-git2 (it has never recovered):

System: 4P Itanium
Test:Result Group 1
Metric: VolanoMark
Result: -3% -10%
Kernel: 2.6.13 vs 2.6.13-git2

Does anybody know the cause of this?

regards,
Felix

2005-10-29 00:29:20

by Chen, Kenneth W

Subject: RE: kernel performance update - 2.6.14

Felix Oxley wrote on Friday, October 28, 2005 5:19 PM
> Chen, Kenneth W wrote:
> > Kernel performance data for 2.6.14 (released yesterday) is updated at:
> > http://kernel-perf.sourceforge.net
> >
> > As expected, results are within run variation compares to 2.6.14-rc5.
> > No significant deviation found compare to 2.6.14-rc5
> >
>
> There seems to be some regression here:
>
> System: 4P Xeon
> Test:Result Group 8
> Metric: 64KB_4_fread
> Result: +1.9% -15%
> Kernel: 2.6.14-rc4 vs 2.6.14-rc4-git4
>
> System: 2P Xeon
> Test:Result Group 7
> Metric: ODIRECT
> Kernel: 2.6.14-rc5 vs 2.6.14-rc5-git3
> Summary: Write has increased whereas Read has decreased by 4-5 %
>
> Any thoughts?

Not off the top of my head at the moment. These are iozone workloads;
we will investigate them.

- Ken

2005-10-29 00:42:59

by Chen, Kenneth W

Subject: RE: kernel performance update - 2.6.14

Felix Oxley wrote on Friday, October 28, 2005 5:29 PM
> Something went horribly wrong with this test between 2.6.13 and
> 2.6.13-git2 (it has never recovered):
>
> System: 4P Itanium
> Test:Result Group 1
> Metric: VolcanoMark
> Result: -3% -10%
> Kernel: 2.6.13 vs 2.6.13-git2
>
> Does anybody know the cause of this?

Search the archive; it was discussed here:
http://marc.theaimsgroup.com/?l=linux-ia64&m=112683124124723&w=2


It is not because of changes in 2.6.13-git2. It would have shown up on
2.6.13-rc1, when the default HZ rate was switched to 250. I happened
to audit the system at that time and made the HZ switch (from 1000
to 250), and the problem showed up.
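
Editorial illustration, not part of the original message: the HZ value sets the kernel timer tick rate, so lowering it from 1000 to 250 lengthens the tick period. The sketch below only does that arithmetic; the function name is hypothetical, and it makes no claim about which timers in the benchmark path are affected.

```python
def tick_period_ms(hz: int) -> float:
    """Length of one timer tick in milliseconds for a given HZ value."""
    return 1000.0 / hz


old_hz, new_hz = 1000, 250  # values quoted in the message above

print(f"HZ={old_hz}: {tick_period_ms(old_hz)} ms per tick")
print(f"HZ={new_hz}: {tick_period_ms(new_hz)} ms per tick")
print(f"tick period grows by a factor of {tick_period_ms(new_hz) / tick_period_ms(old_hz)}")
```

Any timeout that is rounded to whole ticks becomes coarser by the same factor, which is one reason a workload that waits on short timers can be sensitive to this setting.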

More discussion here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112854723926854&w=2


- Ken

2005-10-29 02:47:41

by Nick Piggin

Subject: Re: kernel performance update - 2.6.14

Jeff V. Merkey wrote:
>
> Verified. These numbers reflect my measurements as well. I have not
> moved off 2.6.9 to newer kernels on shipping products due to these
> issues. There are also serious stability issues as well, though 2.6.14
> seems a little better than than previous kernels.
> Jeff
>

These issues aren't going to fix themselves. Did you investigate
any of the performance or (more importantly) stability problems?

--
SUSE Labs, Novell Inc.


2005-10-29 04:32:43

by jmerkey

Subject: Re: kernel performance update - 2.6.14

Nick Piggin wrote:

> Jeff V. Merkey wrote:
>
>>
>> Verified. These numbers reflect my measurements as well. I have not
>> moved off 2.6.9 to newer kernels on shipping products due to these
>> issues. There are also serious stability issues as well, though
>> 2.6.14 seems a little better than than previous kernels. Jeff
>>
>
> These issues aren't going to fix themselves. Did you investigate
> any of the performance or (more importantly) stability problems?

Yes, I did. The list wasn't too long. I had problems with RCU messages
and irq warn messages at very high loads, and with init respawning
itself when subjected to loads > 369 MB/s to the disk channels on 2.6.13.
Disk I/O performance was down compared with 2.6.9. I did not investigate
the BIO fixes, but something changed there. There are also some memory
corruption problems somewhere in 2.6.14.

Jeff

2005-10-29 04:35:13

by jmerkey

Subject: Re: kernel performance update - 2.6.14

jmerkey wrote:

> Nick Piggin wrote:
>
>> Jeff V. Merkey wrote:
>>
>>>
>>> Verified. These numbers reflect my measurements as well. I have not
>>> moved off 2.6.9 to newer kernels on shipping products due to these
>>> issues. There are also serious stability issues as well, though
>>> 2.6.14 seems a little better than than previous kernels. Jeff
>>>
>>
>> These issues aren't going to fix themselves. Did you investigate
>> any of the performance or (more importantly) stability problems?
>
>
Added a little more clarification.

Jeff

> Yes I did. The list wasn't too long. I had problems with RCU messages
> and irq warn messages at very high loads and init respawning itself
> subjected to loads > 369 MB/S to the disk channels on 2.6.13.
> Performance was down on disk I/O [vs.] 2.6.9. I did not investigate
> the BIO fixes but something changed there. Theres also some memory
> problems with corruption somewhere in the 2.6.14 (during module unload
> and shutdown).
>
> Jeff
>

2005-10-29 04:42:39

by jmerkey

Subject: Re: kernel performance update - 2.6.14



And one other item. Setting the build to a preemptible kernel seems to
improve I/O performance relative to 2.6.9. If you don't use it, the
console has long periods where user processes are starved under
extremely heavy I/O loads.

Jeff


jmerkey wrote:

> jmerkey wrote:
>
>> Nick Piggin wrote:
>>
>>> Jeff V. Merkey wrote:
>>>
>>>>
>>>> Verified. These numbers reflect my measurements as well. I have not
>>>> moved off 2.6.9 to newer kernels on shipping products due to these
>>>> issues. There are also serious stability issues as well, though
>>>> 2.6.14 seems a little better than than previous kernels. Jeff
>>>>
>>>
>>> These issues aren't going to fix themselves. Did you investigate
>>> any of the performance or (more importantly) stability problems?
>>
>>
>>
> Added a little more clarification.
>
> Jeff
>
>> Yes I did. The list wasn't too long. I had problems with RCU messages
>> and irq warn messages at very high loads and init respawning itself
>> subjected to loads > 369 MB/S to the disk channels on 2.6.13.
>> Performance was down on disk I/O [vs.] 2.6.9. I did not investigate
>> the BIO fixes but something changed there. Theres also some memory
>> problems with corruption somewhere in the 2.6.14 (during module
>> unload and shutdown).
>>
>> Jeff
>>

2005-10-29 08:36:48

by Nick Piggin

Subject: Re: kernel performance update - 2.6.14

jmerkey wrote:
> jmerkey wrote:

>> Yes I did. The list wasn't too long. I had problems with RCU messages
>> and irq warn messages at very high loads and init respawning itself
>> subjected to loads > 369 MB/S to the disk channels on 2.6.13.
>> Performance was down on disk I/O [vs.] 2.6.9. I did not investigate
>> the BIO fixes but something changed there. Theres also some memory
>> problems with corruption somewhere in the 2.6.14 (during module unload
>> and shutdown).
>>

Well that doesn't sound too good. It would be good if you could document
and report each problem - the messages, workload, kernel config and any
patches used, etc. And post them to lkml. Hopefully they can get sorted
out.

Thanks,
Nick

--
SUSE Labs, Novell Inc.


2005-10-29 11:17:50

by Dr. David Alan Gilbert

Subject: Re: kernel performance update - 2.6.14

Hi,
Have you any idea what happened to cpu-fp on the Dual Xeons?
Across the other 2 platforms it is pretty stable, but cpu-fp
seems to have gone up a few % around 2.6.13-rc7 and dropped back
down around 2.6.14-rc2 and 2.6.14-rc4; now it isn't any lower
than where you started - but it would be nice to have whatever
it was that got that extra few %.

Dave
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux on Alpha,68K| Happy \
\ gro.gilbert @ treblig.org | MIPS,x86,ARM,SPARC,PPC & HPPA | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/

2005-11-08 00:05:03

by Chen, Tim C

Subject: RE: kernel performance update - 2.6.14


> Felix Oxley wrote on Friday, October 28, 2005 5:29 PM
>> Something went horribly wrong with this test between 2.6.13 and
>> 2.6.13-git2 (it has never recovered):
>>
>> System: 4P Itanium
>> Test:Result Group 1
>> Metric: VolcanoMark
>> Result: -3% -10%
>> Kernel: 2.6.13 vs 2.6.13-git2
>>
>> Does anybody know the cause of this?
>
> Search the archive, it was discussed here:
> http://marc.theaimsgroup.com/?l=linux-ia64&m=112683124124723&w=2
>
>
> It is not because of changes in 2.6.13-git2. It would've shown on
> 2.6.13-rc1 when default hz rate was switched to 250. I happened to
> audit the system at that time and made the hz switch (from 1000 to
> 250 and the problem showed up.
>
> More discussion here:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=112854723926854&w=2
>
>
> - Ken

For 4P Itanium, it turns out that Volanomark is operating
suboptimally, with the system 55% idle in the region of operation
where the regression occurs for HZ changes. The Volanomark server
broadcasts short messages (each only ~40 bytes) from each client to
the other clients in the chatroom (20 clients/chatroom in the test).
However, each message consumes an sk_buff taking up 1 page (16K on
Itanium) of memory, as the message is sent immediately without
coalescing with other messages. We quickly hit our default write
buffer size limit, tcp_wmem_max (128K), with 8 outstanding packets.
The server stalls waiting for acknowledgement. Due to TCP's delayed
acknowledgement, the server does not get an ACK immediately to
continue. By either increasing the TCP write buffer size or patching
the kernel to send TCP ACKs without delay, we can reduce the system
idle to 0%, and Volanomark performance does not show a regression in
this case when the HZ rate changes.
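
Editorial illustration, not part of the original message: the stall described above follows from simple arithmetic on the quoted figures (16K per sk_buff, a 128K write buffer limit, ~40-byte messages). The sketch below reproduces that arithmetic; the constant and function names are hypothetical, and the doubled buffer size is just an example value, not a tuning recommendation from the thread.

```python
PAGE_SIZE = 16 * 1024             # memory charged per message: one sk_buff page on Itanium
WRITE_BUFFER_LIMIT = 128 * 1024   # default write buffer size limit quoted above
MESSAGE_PAYLOAD = 40              # approximate bytes of chat data per message


def max_outstanding_messages(buffer_limit: int, per_message_cost: int) -> int:
    """How many unacknowledged messages fit before the sender has to wait."""
    return buffer_limit // per_message_cost


outstanding = max_outstanding_messages(WRITE_BUFFER_LIMIT, PAGE_SIZE)
useful_bytes = outstanding * MESSAGE_PAYLOAD

print(f"messages in flight before stalling: {outstanding}")
print(f"payload bytes actually in flight:   {useful_bytes}")
print(f"share of the buffer carrying data:  {useful_bytes / WRITE_BUFFER_LIMIT:.2%}")

# Example only: doubling the limit doubles the messages allowed in flight.
print(max_outstanding_messages(2 * WRITE_BUFFER_LIMIT, PAGE_SIZE))
```

With only a few hundred bytes of payload allowed in flight, the sender spends most of its time waiting for ACKs, which matches the high idle percentage reported above.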

- Tim