2009-06-02 14:24:37

by Chris Mason

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Sat, May 30, 2009 at 12:23:30PM +0200, Ingo Molnar wrote:
>
> * Nick Piggin <[email protected]> wrote:
>
> > FWIW, we had to disable paravirt in our default SLES11 kernel.
> > (admittedly this was before some of the recent improvements were
> > made). But there are only so many 1% performance regressions you
> > can introduce before customers won't upgrade (or vendors won't
> > publish benchmarks with the new software).
> >
> > But OTOH, almost any bit feature is going to cost performance. I don't
> > think this is something new (as noted with CONFIG_SECURITY). [...]
>
> Yes in a way, but the difference is that:
>
> - i noted CONFIG_SECURITY as the _worst current example_. It is the
> largest-overhead feature known to me in this area, and i
> benchmark the kernel a lot. CONFIG_PARAVIRT has _even more_
> overhead so it takes the (dubious) top spot in this category.
>
> - CONFIG_SECURITY, in the distros where it's enabled (most of them)
> it is actually being relied on by the default user-space. It's
> being configured and every default install of the distro has a
> real (or at least perceived) advantage from it.
>
> Not so with CONFIG_PARAVIRT. That feature is almost fully parasitic
> to native environments: currently it brings no advantage on native
> hardware _at all_ (and 95% of the users dont care about Xen).
>
> Still it's impractical for a distro to disable it because adding a
> separate kernel package has high costs too and PARAVIRT=y is needed
> for the weird execution environment that Xen requires to run Linux
> with acceptable speed.

I think all of the distros expect mainline to do a difficult balancing
act. On the one hand we don't want any performance regressions, and on
the other hand, we need to be able to work with mainline to get the
major features that we're shipping and using every day in.

What we have here is a CONFIG_ that would be heavily used if it were
included. It may be used as part of a separate kernel build, or it may
be used for a single kernel image. Either way, this is something for
the distros to decide. You're arguing to leave it out of mainline
because making a separate kernel package for it is too hard, but by
doing so forcing them to maintain the patches out of tree. They still
have the same decision about the second package either way.

The xen developers are not dropping this code on mainline and running
away to spend the rest of their lives on the beach. This is is a step
in long term development, and keeping it out of the kernel is making
testing and continued development much harder on the users.

I'm not suggesting we should take broken code, or that we should lower
standards just for xen. But, expecting the xen developers to fix the 1%
hit on a very specific micro-benchmark is not a way to promote new
projects for the kernel, and it isn't a good way to convince people to
do continued development in mainline instead of in private trees.

Please reconsider. Keeping these patches out is only making it harder
on the people that want to make them better.

-chris


2009-06-02 14:57:20

by Ulrich Drepper

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Tue, Jun 2, 2009 at 7:18 AM, Chris Mason <[email protected]> wrote:
> I'm not suggesting we should take broken code, or that we should lower
> standards just for xen.  But, expecting the xen developers to fix the 1%
> hit on a very specific micro-benchmark is not a way to promote new
> projects for the kernel, and it isn't a good way to convince people to
> do continued development in mainline instead of in private trees.

It's not a new project which needs to be treated with kid's gloves.
And one be sure that once the code is in the tree those interested
parties will not be as strongly motivated to fix any problem like
this. Ingo pointed to a way which doesn't negatively impact the
performance of the Xen kernel and reduces the overhead (dynamic
patching). Just get started on this (and general cleanup) and this
whole argument will go away.

I find it ridiculous to use the "but it's used" argument to try to
force the code into the kernel. By this argument you can say the same
about crap like ndiswrapper and similarly harmful code.

2009-06-02 15:10:25

by Chris Mason

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Tue, Jun 02, 2009 at 07:49:32AM -0700, Ulrich Drepper wrote:
> On Tue, Jun 2, 2009 at 7:18 AM, Chris Mason <[email protected]> wrote:
> > I'm not suggesting we should take broken code, or that we should lower
> > standards just for xen. ?But, expecting the xen developers to fix the 1%
> > hit on a very specific micro-benchmark is not a way to promote new
> > projects for the kernel, and it isn't a good way to convince people to
> > do continued development in mainline instead of in private trees.
>
> It's not a new project which needs to be treated with kid's gloves.

Sure, I'm not suggesting we send them flowers on mothers day or
anything, and I'm not suggesting they skip out on important cleanups.
But, I strongly object to a 1% hit on a random micro benchmark as a
reason to keep the code out.

> And one be sure that once the code is in the tree those interested
> parties will not be as strongly motivated to fix any problem like
> this.

The idea that people shipping xen aren't interested in performance
regressions is really strange to me.

> Ingo pointed to a way which doesn't negatively impact the
> performance of the Xen kernel and reduces the overhead (dynamic
> patching). Just get started on this (and general cleanup) and this
> whole argument will go away.

Dynamic patching is a big wad of duct tape over the problem.

>
> I find it ridiculous to use the "but it's used" argument to try to
> force the code into the kernel. By this argument you can say the same
> about crap like ndiswrapper and similarly harmful code.

I'm not saying to take harmful code, I'm saying to take code with a
small performance regression under a specific CONFIG_. Slub regresses
more than 1% on database loads, CONFIG_SCHED_GROUPS, the list goes on
and on.

The idea that we should take code that is heavily used is important.
The best place to fix xen is in the kernel. It always has been, and
keeping it out is just making it harder on everyone involved.

-chris

2009-06-02 15:23:12

by Ulrich Drepper

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Tue, Jun 2, 2009 at 8:03 AM, Chris Mason <[email protected]> wrote:
> The idea that people shipping xen aren't interested in performance
> regressions is really strange to me.

Why? They have a different base line. For them any regression to
bare hardware performance is even a positive (since it means the gap
between hardware and virt shrinks).


> Dynamic patching is a big wad of duct tape over the problem.

And what do you call the Xen model? It's a perfect fit IMO.


> I'm not saying to take harmful code, I'm saying to take code with a
> small performance regression under a specific CONFIG_.  Slub regresses
> more than 1% on database loads, CONFIG_SCHED_GROUPS, the list goes on
> and on.

None of those have to be enabled in default kernels.


> The best place to fix xen is in the kernel.

No. The best way to fix things is _on the way into the kernel_.

2009-06-02 16:27:18

by Chris Mason

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Tue, Jun 02, 2009 at 08:22:57AM -0700, Ulrich Drepper wrote:
> On Tue, Jun 2, 2009 at 8:03 AM, Chris Mason <[email protected]> wrote:
> > The idea that people shipping xen aren't interested in performance
> > regressions is really strange to me.
>
> Why? They have a different base line. For them any regression to
> bare hardware performance is even a positive (since it means the gap
> between hardware and virt shrinks).

And we would have gotten away with it too if it weren't for you meddling
kids!

>
>
> > Dynamic patching is a big wad of duct tape over the problem.
>
> And what do you call the Xen model? It's a perfect fit IMO.
>
> > I'm not saying to take harmful code, I'm saying to take code with a
> > small performance regression under a specific CONFIG_. ?Slub regresses
> > more than 1% on database loads, CONFIG_SCHED_GROUPS, the list goes on
> > and on.
>
> None of those have to be enabled in default kernels.
>
>
> > The best place to fix xen is in the kernel.
>
> No. The best way to fix things is _on the way into the kernel_.

It all depends on which parts are causing problems. A 1% performance
hit, under a CONFIG_ that can be disabled? If maintainers are focusing
on details like this for long term and active projects, we're doing
something very wrong.

-chris

2009-06-02 18:06:49

by Pekka Enberg

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

Hi Chris,

On Tue, Jun 2, 2009 at 6:03 PM, Chris Mason <[email protected]> wrote:
>> I find it ridiculous to use the "but it's used" argument to try to
>> force the code into the kernel. ?By this argument you can say the same
>> about crap like ndiswrapper and similarly harmful code.
>
> I'm not saying to take harmful code, I'm saying to take code with a
> small performance regression under a specific CONFIG_. ?Slub regresses
> more than 1% on database loads, CONFIG_SCHED_GROUPS, the list goes on
> and on.

Maybe it's just me but you make it sound like the SLUB regression is
okay. It's not.

Unfortunately we're now in a position where we can't just remove SLUB
(it's an improvement over SLAB for NUMA) so we're stuck with two
allocators with third one on its way to the kernel. So yes, it makes a
lot of sense to me to fix CONFIG_PARAVIRT regression before merging
more of the xen stuff in the kernel. It's always easier to fix these
things before they hit the kernel and people start to depend on them.

Pekka

2009-06-02 18:13:45

by Pekka Enberg

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

Hi Chris,

On Tue, Jun 2, 2009 at 8:03 AM, Chris Mason <[email protected]> wrote:
>>> The best place to fix xen is in the kernel.

On Tue, Jun 02, 2009 at 08:22:57AM -0700, Ulrich Drepper wrote:
>> No. ?The best way to fix things is _on the way into the kernel_.

On Tue, Jun 2, 2009 at 7:20 PM, Chris Mason <[email protected]> wrote:
> It all depends on which parts are causing problems. ?A 1% performance
> hit, under a CONFIG_ that can be disabled? ?If maintainers are focusing
> on details like this for long term and active projects, we're doing
> something very wrong.

The fact that CONFIG_PARAVIRT can be disabled doesn't really help. As
a matter of fact, I'd argue that one of the primary reasons
CONFIG_SLUB regression is still there is because people can just
disable it and use CONFIG_SLAB instead.

So I think we have some evidence to suggest that people have less
incentive to fix things once something is merged to the kernel. And I
don't mean the authors of the code here but basically _everyone_
involved in kernel development. It usually takes effort from variety
of people to get everything ironed out because, lets face it, we can't
expect a handful of people to test out every configuration let alone
fix them.

Pekka

2009-06-02 18:33:56

by Chris Mason

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Tue, Jun 02, 2009 at 09:06:41PM +0300, Pekka Enberg wrote:
> Hi Chris,
>
> On Tue, Jun 2, 2009 at 6:03 PM, Chris Mason <[email protected]> wrote:
> >> I find it ridiculous to use the "but it's used" argument to try to
> >> force the code into the kernel. ?By this argument you can say the same
> >> about crap like ndiswrapper and similarly harmful code.
> >
> > I'm not saying to take harmful code, I'm saying to take code with a
> > small performance regression under a specific CONFIG_. ?Slub regresses
> > more than 1% on database loads, CONFIG_SCHED_GROUPS, the list goes on
> > and on.
>
> Maybe it's just me but you make it sound like the SLUB regression is
> okay. It's not.

Well, it is and it isn't. SLUB was implemented with specific workloads
in mind. I'd prefer that regressions not get in at all, but sometimes
it takes the broad exposure you get from being in mainline to
finish things. Sometimes we finish things with rm, but without slub I
don't think the issues it was trying to solve would have been discussed
at all.

>
> Unfortunately we're now in a position where we can't just remove SLUB
> (it's an improvement over SLAB for NUMA) so we're stuck with two
> allocators with third one on its way to the kernel. So yes, it makes a
> lot of sense to me to fix CONFIG_PARAVIRT regression before merging
> more of the xen stuff in the kernel. It's always easier to fix these
> things before they hit the kernel and people start to depend on them.

The problem is that people already depend on them ;) If people want to
nack xen based on code structure, that's more than fair, I just hope we
can keep the discussion around something the xen developers can work
toward.

Micro benchmarks come and go, we tune as best we can based on the
tradeoffs at hand.

-chris

2009-06-02 19:21:38

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Tue, 2 Jun 2009, Chris Mason wrote:
> I'm not suggesting we should take broken code, or that we should lower
> standards just for xen. But, expecting the xen developers to fix the 1%
> hit on a very specific micro-benchmark is not a way to promote new
> projects for the kernel, and it isn't a good way to convince people to
> do continued development in mainline instead of in private trees.
>
> Please reconsider. Keeping these patches out is only making it harder
> on the people that want to make them better.

You are missing one subtle point.

I read several times, that A, B and C can not be changed design wise
to allow newer kernels to run on older hypervisors. That's what
frightens me:

dom0 imposes a kind of ABI which we can not change anymore.

So where is the room for the improvements which you expect when dom0
is merged ? It's not about micro benchmark results, it's about the
inability to fix the existing design decisions in the near future.

You can change the internals of btrfs as often as you want, but you
can not change the on disk layout at will. And while you can invent
btrfs2 w/o any impact aside of grumpy users and a couple of thousand
lines self contained code, dom0v2 would just add a different layer of
intrusiveness into the x86 code base w/o removing the existing one.

Thanks,

tglx

2009-06-02 19:58:41

by Chris Mason

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

On Tue, Jun 02, 2009 at 09:14:23PM +0200, Thomas Gleixner wrote:
> On Tue, 2 Jun 2009, Chris Mason wrote:
> > I'm not suggesting we should take broken code, or that we should lower
> > standards just for xen. But, expecting the xen developers to fix the 1%
> > hit on a very specific micro-benchmark is not a way to promote new
> > projects for the kernel, and it isn't a good way to convince people to
> > do continued development in mainline instead of in private trees.
> >
> > Please reconsider. Keeping these patches out is only making it harder
> > on the people that want to make them better.
>
> You are missing one subtle point.
>

I'm sure I'm missing many more than one ;)

> I read several times, that A, B and C can not be changed design wise
> to allow newer kernels to run on older hypervisors. That's what
> frightens me:
>
> dom0 imposes a kind of ABI which we can not change anymore.
>
> So where is the room for the improvements which you expect when dom0
> is merged ? It's not about micro benchmark results, it's about the
> inability to fix the existing design decisions in the near future.
>
> You can change the internals of btrfs as often as you want, but you
> can not change the on disk layout at will. And while you can invent
> btrfs2 w/o any impact aside of grumpy users and a couple of thousand
> lines self contained code, dom0v2 would just add a different layer of
> intrusiveness into the x86 code base w/o removing the existing one.

Well, if there's a line we want to draw in the sand based on firm and
debatable criteria, great.

The problem I see here is that our line in the sand for the xen
developers is fuzzy and winding (yeah, I saw Linus' reply in the other
thread, full ack on that).

-chris

2009-06-03 06:33:35

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [benchmark] 1% performance overhead of paravirt_ops on native kernels

Ulrich Drepper wrote:
> Ingo pointed to a way which doesn't negatively impact the
> performance of the Xen kernel and reduces the overhead (dynamic
> patching)

The pvops code is already fully dynamically patched, which replaces all
the indirect calls with either direct calls, inline instructions or
nops. It has been this way from the initial implementation.

More recently I changed the calling convention on some of the most
common critical-path ops to reduce the register pressure caused by the
function call clobbers; you just don't need a pile of registers to
disable interrupts.

Ingo knows all this, so I'm not sure what further patching he's
suggesting. I don't see any more likely candidates, but I'm open to
suggestions.

J