2005-10-05 21:15:44

by Chen, Kenneth W

Subject: kernel performance update - 2.6.14-rc3

Kernel performance data for 2.6.14-rc3 is updated at:
http://kernel-perf.sourceforge.net

We are continuing our investigation into the volanomark regression
seen when the HZ rate is reduced from the default 1K to 250. The workload
is run over the loopback interface. Preliminarily, we think it is
related to how softirq is invoked. Multiple threads are usually
blocked waiting on incoming socket data (sleeping inside the kernel via
the sk_wait_data function). It takes an external event (e.g., the NIC
receiving a packet over the wire and the subsequent hw interrupt) to
trigger a thread wakeup. However, with the software loopback device,
the link between xmit and rcv is done via softirq. Even though
softirq is invoked at the end of dev_queue_xmit() via local_bh_enable(),
not every execution of softirq results in a __wake_up(). With a higher
HZ rate, the timer interrupt is more frequent, so there are more softirq
invocations and more __wake_up() calls, which then takes us to higher
throughput because the cpu spends less time idle. We are continuing
with more experiments to follow up.
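
For a rough sense of scale, here is a back-of-the-envelope sketch of the
timer interval at the two HZ settings. It assumes (this is the hypothesis
above, not a measured result) that a wakeup missed on the xmit path is only
picked up at the next timer tick. The function names are made up for
illustration.

```python
def tick_period_ms(hz):
    """Interval between two timer interrupts, in milliseconds."""
    return 1000.0 / hz


def worst_case_extra_idle_ms(hz):
    """Longest time a sleeping thread could wait for the next tick.

    Assumption: the missed wakeup is delivered by the softirq run that
    follows the next timer interrupt, and by nothing earlier.
    """
    return tick_period_ms(hz)


if __name__ == "__main__":
    for hz in (1000, 250):
        print("HZ=%d: tick every %.1f ms, worst-case extra idle %.1f ms"
              % (hz, tick_period_ms(hz), worst_case_extra_idle_ms(hz)))
```

Under that assumption the worst-case extra idle time per missed wakeup is
four times longer at HZ=250 than at HZ=1000.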

dbench is catching some attention. We just ran it with default
parameters. I don't think the defaults are the right ones to use
on some of our configurations. For example, it shows a +100% improvement
on 4P Xeon between the latest kernel and 2.6.9, while showing -45% on 4P ia64.
That just doesn't make much sense to me. Do any experts out there have
a recommendation on the proper parameters to use for this workload?
The same goes for tbench (what are the proper parameters to use here?).

- Ken


2005-10-06 06:16:18

by Arjan van de Ven

Subject: Re: kernel performance update - 2.6.14-rc3


> dbench is catching some attention. We just ran it with default
> parameters. I don't think the defaults are the right ones to use
> on some of our configurations. For example, it shows a +100% improvement

never ever consider dbench a serious benchmark; the thing is you can
make dbench a lot better very easily; just make the kernel run one thread
at a time until completion. dbench really gives very variable results,
and it is not really possible to say if +100% or -100% is an improvement
or a degradation for real life. So please just don't run it, or at least
don't interpret the results in a "higher is better" way.

2005-10-06 10:05:20

by Nick Piggin

Subject: Re: kernel performance update - 2.6.14-rc3

Arjan van de Ven wrote:
>>dbench is catching some attention. We just ran it with default
>>parameters. I don't think the defaults are the right ones to use
>>on some of our configurations. For example, it shows a +100% improvement
>
>
> never ever consider dbench a serious benchmark; the thing is you can
> make dbench a lot better very easily; just make the kernel run one thread
> at a time until completion. dbench really gives very variable results,
> and it is not really possible to say if +100% or -100% is an improvement
> or a degradation for real life. So please just don't run it, or at least
> don't interpret the results in a "higher is better" way.
>

As a disk IO performance benchmark you are absolutely right.

Some people like using it to test VM scalability and throughput
if it is being used on tmpfs. In that case the results are
generally more stable.

--
SUSE Labs, Novell Inc.


2005-10-06 10:32:15

by Andi Kleen

Subject: Re: kernel performance update - 2.6.14-rc3

"Chen, Kenneth W" <[email protected]> writes:

> Even though
> softirq is invoked at the end of dev_queue_xmit() via local_bh_enable(),
> not every execution of softirq results in a __wake_up(). With a higher
> HZ rate, the timer interrupt is more frequent, so there are more softirq
> invocations and more __wake_up() calls, which then takes us to higher
> throughput because the cpu spends less time idle.

This sounds like a serious bug somewhere if true.

-Andi

2005-10-06 18:36:54

by Chen, Kenneth W

Subject: RE: kernel performance update - 2.6.14-rc3

Nick Piggin wrote on Thursday, October 06, 2005 3:04 AM
> Arjan van de Ven wrote:
> >>dbench is catching some attention. We just ran it with default
> >>parameters. I don't think the defaults are the right ones to use
> >>on some of our configurations. For example, it shows a +100% improvement
> >
> > never ever consider dbench a serious benchmark; the thing is you can
> > make dbench a lot better very easily; just make the kernel run one thread
> > at a time until completion. dbench really gives very variable results,
> > and it is not really possible to say if +100% or -100% is an improvement
> > or a degradation for real life. So please just don't run it, or at least
> > don't interpret the results in a "higher is better" way.
> >
>
> As a disk IO performance benchmark you are absolutely right.
>
> Some people like using it to test VM scalability and throughput
> if it is being used on tmpfs. In that case the results are
> generally more stable.

Thank you for the suggestion; we will look into the options. I agree here
as well, and I also don't consider dbench a serious disk I/O performance
benchmark. There are other workloads that we ran (IOzone, aiostress, and my
favorite "industry standard database workload") which cover the disk I/O side
pretty well.

- Ken

2005-10-06 19:30:26

by Sonny Rao

Subject: Re: kernel performance update - 2.6.14-rc3

On Thu, Oct 06, 2005 at 11:36:47AM -0700, Chen, Kenneth W wrote:
> Nick Piggin wrote on Thursday, October 06, 2005 3:04 AM
> > Arjan van de Ven wrote:
> > >>dbench is catching some attention. We just ran it with default
> > >>parameters. I don't think the defaults are the right ones to use
> > >>on some of our configurations. For example, it shows a +100% improvement
> > >
> > > never ever consider dbench a serious benchmark; the thing is you can
> > > make dbench a lot better very easily; just make the kernel run one thread
> > > at a time until completion. dbench really gives very variable results,
> > > and it is not really possible to say if +100% or -100% is an improvement
> > > or a degradation for real life. So please just don't run it, or at least
> > > don't interpret the results in a "higher is better" way.
> > >
> >
> > As a disk IO performance benchmark you are absolutely right.
> >
> > Some people like using it to test VM scalability and throughput
> > if it is being used on tmpfs. In that case the results are
> > generally more stable.
>
> Thank you for the suggestion; we will look into the options. I agree here
> as well, and I also don't consider dbench a serious disk I/O performance
> benchmark. There are other workloads that we ran (IOzone, aiostress, and my
> favorite "industry standard database workload") which cover the disk I/O side
> pretty well.

Also, make sure you are running dbench version 3; it has much, much
better reproducibility and self-consistency than version 2.

Sonny

2005-10-06 19:32:34

by Chen, Kenneth W

Subject: RE: kernel performance update - 2.6.14-rc3

Sonny Rao wrote on Thursday, October 06, 2005 12:24 PM
> Also, make sure you are running dbench version 3; it has much, much
> better reproducibility and self-consistency than version 2.

Yes, we are running dbench-3.03.

- Ken

2005-10-08 00:50:12

by Jon Burgess

Subject: Re: kernel performance update - 2.6.14-rc3

"Chen, Kenneth W" <[email protected]> writes:

> Even though
> softirq is invoked at the end of dev_queue_xmit() via local_bh_enable(),
> not every execution of softirq results in a __wake_up(). With a higher
> HZ rate, the timer interrupt is more frequent, so there are more softirq
> invocations and more __wake_up() calls, which then takes us to higher
> throughput because the cpu spends less time idle.

Since the loopback xmit->rx path probably isn't being called in
interrupt context, might something like the patch below be needed?

Please forgive me if this is wrong; I've not even tried compiling this
change, let alone tested it.

Jon



Attachments:
loopback-netif_rx.patch (314.00 B)

2005-10-08 01:38:33

by Chen, Kenneth W

Subject: RE: kernel performance update - 2.6.14-rc3

Jon Burgess wrote on Friday, October 07, 2005 5:50 PM
> "Chen, Kenneth W" <[email protected]> writes:
> > Even though
> > softirq is invoked at the end of dev_queue_xmit() via local_bh_enable(),
> > not every execution of softirq results in a __wake_up(). With a higher
> > HZ rate, the timer interrupt is more frequent, so there are more softirq
> > invocations and more __wake_up() calls, which then takes us to higher
> > throughput because the cpu spends less time idle.
>
> Since the loopback xmit->rx path probably isn't being called in
> interrupt context, might something like the patch below be needed?
>
> Please forgive me if this is wrong; I've not even tried compiling
> this change, let alone tested it.

I don't think this patch has any effect. dev_queue_xmit turns off
local_bh when calling loopback_xmit, so calls to do_softirq from
netif_rx_ni will be a no-op, since do_softirq returns immediately upon
seeing in_interrupt().
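
To make the argument concrete, here is a small user-space toy model of
that sequence. It is not kernel code: the Cpu class, its fields, and the
single pending flag are simplifications invented for illustration, and
the body of dev_queue_xmit below only stands in for "loopback_xmit
calling netif_rx_ni with bottom halves disabled".

```python
SOFTIRQ_OFFSET = 0x100  # stand-in for the softirq part of the preempt count


class Cpu:
    """Toy model of one CPU's bottom-half state (illustrative only)."""

    def __init__(self):
        self.preempt_count = 0   # non-zero means in_interrupt() is true
        self.pending = False     # a softirq has been raised but not run
        self.softirq_runs = 0    # how many times softirq work actually ran

    def in_interrupt(self):
        return self.preempt_count != 0

    def local_bh_disable(self):
        self.preempt_count += SOFTIRQ_OFFSET

    def local_bh_enable(self):
        self.preempt_count -= SOFTIRQ_OFFSET
        # Pending softirqs run here, once bottom halves are enabled again.
        if not self.in_interrupt() and self.pending:
            self.do_softirq()

    def do_softirq(self):
        if self.in_interrupt():
            return False         # bail out immediately: nothing is processed
        self.pending = False
        self.softirq_runs += 1
        return True

    def netif_rx_ni(self):
        self.pending = True      # queue the packet and raise the rx softirq
        return self.do_softirq()

    def dev_queue_xmit(self):
        self.local_bh_disable()
        ran_inside = self.netif_rx_ni()   # models loopback_xmit with the patch
        self.local_bh_enable()            # the softirq really runs here
        return ran_inside


if __name__ == "__main__":
    cpu = Cpu()
    print("softirq ran inside xmit:", cpu.dev_queue_xmit())
    print("softirq runs in total:", cpu.softirq_runs)
```

In this model the do_softirq call made from inside the xmit path never
runs anything; the one softirq run still happens at local_bh_enable(),
which is the same place it happens without the patch.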

- Ken