LinuxLists.cc - [PATCH 1/4, v2] x86: enlightenment for ticket spin locks

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
> Add optional (alternative instructions based) callout hooks to the
> contended ticket lock and the ticket unlock paths, to allow hypervisor
> specific code to be used for reducing/eliminating the bad effects
> ticket locks have on performance when running virtualized.

Uhm, I'd much rather see a single alternative implementation, not a
per-hypervisor lock implementation.

> For the moment, this isn't intended to be used together with pv-ops,
> but this is just to simplify initial integration. The ultimate goal
> for this should still be to replace pv-ops spinlocks.

So why not start by removing that?

> +config ENLIGHTEN_SPINLOCKS

Why exactly are these enlightened? I'd say CONFIG_UNFAIR_SPINLOCKS would
be much better.

> +#define X86_FEATURE_SPINLOCK_YIELD (3*32+31) /* hypervisor yield interface */

That name also sucks chunks, yield isn't a lock related term.

> +#define ALTERNATIVE_TICKET_LOCK \

But but but, the alternative isn't a ticket lock..!?

2010-06-30 08:24:36

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
> @@ -62,18 +110,20 @@ static __always_inline void __ticket_spi
> {
> short inc = 0x0100;
>
> - asm volatile (
> + alternative_common(
> LOCK_PREFIX "xaddw %w0, %1\n"
> "1:\t"
> "cmpb %h0, %b0\n\t"
> - "je 2f\n\t"
> + "je 2f\n\t",
> "rep ; nop\n\t"
> "movb %1, %b0\n\t"
> /* don't need lfence here, because loads are in-order
> */
> "jmp 1b\n"
> - "2:"
> - : "+Q" (inc), "+m" (lock->slock)
> - :
> + "2:",
> + ALTERNATIVE_TICKET_LOCK,
> + X86_FEATURE_SPINLOCK_YIELD,
> + ASM_OUTPUT2("+Q" (inc), "+m" (lock->slock)),
> + [stub] "i" (virt_spin_lock_stub)
> : "memory", "cc");
> }

Also, instead of obfuscating __ticket_spin_lock(), can't you provide a
whole alternative arch_spin_lock() implementation and switch between
that and whatever bat-shit these paravirt people come up with?

2010-06-30 09:00:26

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

>>> On 30.06.10 at 10:05, Peter Zijlstra <[email protected]> wrote:
> On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
>> Add optional (alternative instructions based) callout hooks to the
>> contended ticket lock and the ticket unlock paths, to allow hypervisor
>> specific code to be used for reducing/eliminating the bad effects
>> ticket locks have on performance when running virtualized.
>
> Uhm, I'd much rather see a single alternative implementation, not a
> per-hypervisor lock implementation.

How would you imaging this to work? I can't see how the mechanism
could be hypervisor agnostic. Just look at the Xen implementation
(patch 2) - do you really see room for meaningful abstraction there?
Not the least that not every hypervisor may even have a way to
poll for events (like Xen does), in which case a simple yield may be
needed instead.

>> For the moment, this isn't intended to be used together with pv-ops,
>> but this is just to simplify initial integration. The ultimate goal
>> for this should still be to replace pv-ops spinlocks.
>
> So why not start by removing that?

Because I wouldn't get around to test it within the time constraints
I have?

>> +config ENLIGHTEN_SPINLOCKS
>
> Why exactly are these enlightened? I'd say CONFIG_UNFAIR_SPINLOCKS would
> be much better.

The naming certainly isn't significant to me. If consensus can be
reached on any one name, I'll be fine with changing it. I just don't
want to play ping pong here.

>> +#define X86_FEATURE_SPINLOCK_YIELD (3*32+31) /* hypervisor yield interface
> */
>
> That name also sucks chunks, yield isn't a lock related term.

Not sure what's wrong with the name (the behavior *is* a yield of
some sort to the underlying scheduler). But again, any name
acceptable to all relevant parties will be fine with me.

>> +#define ALTERNATIVE_TICKET_LOCK \
>
> But but but, the alternative isn't a ticket lock..!?

??? Of course it is. Or do you mean the macro doesn't
represent the full lock operation? My reading of the name is that
this is the common alternative instruction sequence used in a lock
operation. And just as above - I don't care much about the actual
name, and I'll change any or all of them as long as I'm not going to
be asked to change them back and forth.

Jan

2010-06-30 09:12:12

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On Wed, 2010-06-30 at 10:00 +0100, Jan Beulich wrote:
> >>> On 30.06.10 at 10:05, Peter Zijlstra <[email protected]> wrote:
> > On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
> >> Add optional (alternative instructions based) callout hooks to the
> >> contended ticket lock and the ticket unlock paths, to allow hypervisor
> >> specific code to be used for reducing/eliminating the bad effects
> >> ticket locks have on performance when running virtualized.
> >
> > Uhm, I'd much rather see a single alternative implementation, not a
> > per-hypervisor lock implementation.
>
> How would you imaging this to work? I can't see how the mechanism
> could be hypervisor agnostic. Just look at the Xen implementation
> (patch 2) - do you really see room for meaningful abstraction there?

I tried not to, it made my eyes bleed..

But from what I hear all virt people are suffering from spinlocks (and
fair spinlocks in particular), so I was thinking it'd be a good idea to
get all interested parties to collaborate on one. Fragmentation like
this hardly ever works out well.

> Not the least that not every hypervisor may even have a way to
> poll for events (like Xen does), in which case a simple yield may be
> needed instead.

No idea what you're talking about, I think you assume I actually know
something about Xen or virt..

> >> For the moment, this isn't intended to be used together with pv-ops,
> >> but this is just to simplify initial integration. The ultimate goal
> >> for this should still be to replace pv-ops spinlocks.
> >
> > So why not start by removing that?
>
> Because I wouldn't get around to test it within the time constraints
> I have?

I'd say that removing basically dead code (the paravirt spinlocks) the
code you'd be changing was easier to follow and thus your patches would
be done quicker?

> >> +#define ALTERNATIVE_TICKET_LOCK \
> >
> > But but but, the alternative isn't a ticket lock..!?
>
> ??? Of course it is.

Ah, right, after looking a bit more at patch 2 I see you indeed
implement a ticket like lock. Although why you need both a ticket and a
FIFO list is beyond me.

2010-06-30 09:26:47

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On Wed, 2010-06-30 at 10:05 +0200, Peter Zijlstra wrote:
> > +config ENLIGHTEN_SPINLOCKS
>
> Why exactly are these enlightened?

Or did I just miss a terribly pun:
enlightenment -> Buddha -> Zen -> Xen ?

2010-06-30 09:33:29

by Gleb Natapov

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On Wed, Jun 30, 2010 at 11:26:36AM +0200, Peter Zijlstra wrote:
> On Wed, 2010-06-30 at 10:05 +0200, Peter Zijlstra wrote:
> > > +config ENLIGHTEN_SPINLOCKS
> >
> > Why exactly are these enlightened?
>
> Or did I just miss a terribly pun:
> enlightenment -> Buddha -> Zen -> Xen ?
Enlightenment is also a term that MS uses to describe their PV guests
which makes this name very confusing in Xen context. When I saw this
patch series I thought it has something to do with Hyper-V.

--
Gleb.

2010-06-30 09:56:20

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On 06/30/2010 11:11 AM, Peter Zijlstra wrote:
> On Wed, 2010-06-30 at 10:00 +0100, Jan Beulich wrote:
>
>>>>> On 30.06.10 at 10:05, Peter Zijlstra <[email protected]> wrote:
>>>>>
>>> On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
>>>
>>>> Add optional (alternative instructions based) callout hooks to the
>>>> contended ticket lock and the ticket unlock paths, to allow hypervisor
>>>> specific code to be used for reducing/eliminating the bad effects
>>>> ticket locks have on performance when running virtualized.
>>>>
>>> Uhm, I'd much rather see a single alternative implementation, not a
>>> per-hypervisor lock implementation.
>>>
>> How would you imaging this to work? I can't see how the mechanism
>> could be hypervisor agnostic. Just look at the Xen implementation
>> (patch 2) - do you really see room for meaningful abstraction there?
>>
> I tried not to, it made my eyes bleed..
>
> But from what I hear all virt people are suffering from spinlocks (and
> fair spinlocks in particular), so I was thinking it'd be a good idea to
> get all interested parties to collaborate on one. Fragmentation like
> this hardly ever works out well.
>

The fastpath of the spinlocks can be common, but if it ends up spinning
too long (however that might be defined), then it needs to call out to a
hypervisor-specific piece of code which is effectively "yield this vcpu
until its worth trying again". In Xen we can set up an event channel
that the waiting CPU can block on, and the current lock holder can
tickle it when it releases the lock (ideally it would just tickle the
CPU with the next ticket, but that's a further refinement).

I'm not sure what the corresponding implementation for KVM or HyperV
would look like. Modern Intel chips have a "do a VMEXIT if you've run
pause in a tight loop for too long" feature, which deals with the
"spinning too long" part, but I'm not sure about the blocking mechanism
(something based on monitor/mwait perhaps).

J

2010-06-30 10:50:47

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On 06/30/2010 11:11 AM, Peter Zijlstra wrote:
>>> Uhm, I'd much rather see a single alternative implementation, not a
>>> per-hypervisor lock implementation.
>>>
>> How would you imaging this to work? I can't see how the mechanism
>> could be hypervisor agnostic. Just look at the Xen implementation
>> (patch 2) - do you really see room for meaningful abstraction there?
>>
> I tried not to, it made my eyes bleed..
>
> But from what I hear all virt people are suffering from spinlocks (and
> fair spinlocks in particular), so I was thinking it'd be a good idea to
> get all interested parties to collaborate on one. Fragmentation like
> this hardly ever works out well.
>

Yes. Now that I've looked at it a bit more closely I think these
patches put way too much logic into the per-hypervisor part of the code.

> Ah, right, after looking a bit more at patch 2 I see you indeed
> implement a ticket like lock. Although why you need both a ticket and a
> FIFO list is beyond me.
>

That appears to be a mechanism to allow it to take interrupts while
spinning on the lock, which is something that stock ticket locks don't
allow. If that's a useful thing to do, it should happen in the generic
ticketlock code rather than in the per-hypervisor backend (otherwise we
end up with all kinds of subtle differences in lock behaviour depending
on the exact environment, which is just going to be messy). Even if
interrupts-while-spinning isn't useful on native hardware, it is going
to be equally applicable to all virtual environments.

J

2010-06-30 11:43:20

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

>>> On 30.06.10 at 11:56, Jeremy Fitzhardinge <[email protected]> wrote:
> On 06/30/2010 11:11 AM, Peter Zijlstra wrote:
>> On Wed, 2010-06-30 at 10:00 +0100, Jan Beulich wrote:
>>
>>>>>> On 30.06.10 at 10:05, Peter Zijlstra <[email protected]> wrote:
>>>>>>
>>>> On Tue, 2010-06-29 at 15:31 +0100, Jan Beulich wrote:
>>>>
>>>>> Add optional (alternative instructions based) callout hooks to the
>>>>> contended ticket lock and the ticket unlock paths, to allow hypervisor
>>>>> specific code to be used for reducing/eliminating the bad effects
>>>>> ticket locks have on performance when running virtualized.
>>>>>
>>>> Uhm, I'd much rather see a single alternative implementation, not a
>>>> per-hypervisor lock implementation.
>>>>
>>> How would you imaging this to work? I can't see how the mechanism
>>> could be hypervisor agnostic. Just look at the Xen implementation
>>> (patch 2) - do you really see room for meaningful abstraction there?
>>>
>> I tried not to, it made my eyes bleed..
>>
>> But from what I hear all virt people are suffering from spinlocks (and
>> fair spinlocks in particular), so I was thinking it'd be a good idea to
>> get all interested parties to collaborate on one. Fragmentation like
>> this hardly ever works out well.
>>
>
> The fastpath of the spinlocks can be common, but if it ends up spinning
> too long (however that might be defined), then it needs to call out to a
> hypervisor-specific piece of code which is effectively "yield this vcpu
> until its worth trying again". In Xen we can set up an event channel
> that the waiting CPU can block on, and the current lock holder can
> tickle it when it releases the lock (ideally it would just tickle the
> CPU with the next ticket, but that's a further refinement).

It does tickle just the new owner - that's what the list is for.

Jan

2010-06-30 11:49:01

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On Wed, 2010-06-30 at 12:43 +0100, Jan Beulich wrote:

> It does tickle just the new owner - that's what the list is for.

But if you have a FIFO list you don't need the ticket stuff and can
implement a FIFO lock instead.

2010-06-30 11:51:40

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

>>> On 30.06.10 at 12:50, Jeremy Fitzhardinge <[email protected]> wrote:
> On 06/30/2010 11:11 AM, Peter Zijlstra wrote:
>>>> Uhm, I'd much rather see a single alternative implementation, not a
>>>> per-hypervisor lock implementation.
>>>>
>>> How would you imaging this to work? I can't see how the mechanism
>>> could be hypervisor agnostic. Just look at the Xen implementation
>>> (patch 2) - do you really see room for meaningful abstraction there?
>>>
>> I tried not to, it made my eyes bleed..
>>
>> But from what I hear all virt people are suffering from spinlocks (and
>> fair spinlocks in particular), so I was thinking it'd be a good idea to
>> get all interested parties to collaborate on one. Fragmentation like
>> this hardly ever works out well.
>>
>
> Yes. Now that I've looked at it a bit more closely I think these
> patches put way too much logic into the per-hypervisor part of the code.

I fail to see that: Depending on the hypervisor's capabilities, the
two main functions could be much smaller (potentially there wouldn't
even be a need for the unlock hook in some cases), and hence I
continue to think that all code that is in xen.c indeed is non-generic
(while I won't say that there may not be a second hypervisor where
the code might look almost identical).

>> Ah, right, after looking a bit more at patch 2 I see you indeed
>> implement a ticket like lock. Although why you need both a ticket and a
>> FIFO list is beyond me.
>>
>
> That appears to be a mechanism to allow it to take interrupts while
> spinning on the lock, which is something that stock ticket locks don't
> allow. If that's a useful thing to do, it should happen in the generic
> ticketlock code rather than in the per-hypervisor backend (otherwise we
> end up with all kinds of subtle differences in lock behaviour depending
> on the exact environment, which is just going to be messy). Even if
> interrupts-while-spinning isn't useful on native hardware, it is going
> to be equally applicable to all virtual environments.

While we do interrupt re-enabling in our pv kernels, I intentionally
didn't do this here - it complicates the code quite a bit further, and
that did seem right for an initial submission.

The list really juts is needed to not pointlessly tickle CPUs that
won't own the just released lock next anyway (or would own
it, but meanwhile went for another one where they also decided
to go into polling mode).

Jan

2010-06-30 11:54:31

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

>>> On 30.06.10 at 13:48, Peter Zijlstra <[email protected]> wrote:
> On Wed, 2010-06-30 at 12:43 +0100, Jan Beulich wrote:
>
>> It does tickle just the new owner - that's what the list is for.
>
> But if you have a FIFO list you don't need the ticket stuff and can
> implement a FIFO lock instead.

The list is LIFO (not FIFO, as only the most recently added entry is
a candidate for needing wakeup as long as there's no interrupt
re-enabling in irqsave lock paths), and is only used for tickling (not
for deciding who's going to be the next owner).

Jan

2010-06-30 12:53:28

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

On 06/30/2010 01:52 PM, Jan Beulich wrote:
> I fail to see that: Depending on the hypervisor's capabilities, the
> two main functions could be much smaller (potentially there wouldn't
> even be a need for the unlock hook in some cases),

What mechanism are you envisaging in that case?

>> That appears to be a mechanism to allow it to take interrupts while
>> spinning on the lock, which is something that stock ticket locks don't
>> allow. If that's a useful thing to do, it should happen in the generic
>> ticketlock code rather than in the per-hypervisor backend (otherwise we
>> end up with all kinds of subtle differences in lock behaviour depending
>> on the exact environment, which is just going to be messy). Even if
>> interrupts-while-spinning isn't useful on native hardware, it is going
>> to be equally applicable to all virtual environments.
>>
> While we do interrupt re-enabling in our pv kernels, I intentionally
> didn't do this here - it complicates the code quite a bit further, and
> that did seem right for an initial submission.
>

Ah, I was confused by this:
> + /*
> + * If we interrupted another spinlock while it was blocking, make
> + * sure it doesn't block (again) without re-checking the lock.
> + */
> + if (spinning.prev)
> + sync_set_bit(percpu_read(poll_evtchn),
> + xen_shared_info->evtchn_pending);
> +
> +

> The list really juts is needed to not pointlessly tickle CPUs that
> won't own the just released lock next anyway (or would own
> it, but meanwhile went for another one where they also decided
> to go into polling mode).

Did you measure that it was a particularly common case which was worth
optimising for?

J

2010-06-30 13:21:09

[permalink] [raw]

Subject: Re: [PATCH 1/4, v2] x86: enlightenment for ticket spin locks - base implementation

>>> On 30.06.10 at 14:53, Jeremy Fitzhardinge <[email protected]> wrote:
> On 06/30/2010 01:52 PM, Jan Beulich wrote:
>> I fail to see that: Depending on the hypervisor's capabilities, the
>> two main functions could be much smaller (potentially there wouldn't
>> even be a need for the unlock hook in some cases),
>
> What mechanism are you envisaging in that case?

A simple yield is better than not doing anything at all.

>> The list really juts is needed to not pointlessly tickle CPUs that
>> won't own the just released lock next anyway (or would own
>> it, but meanwhile went for another one where they also decided
>> to go into polling mode).
>
> Did you measure that it was a particularly common case which was worth
> optimising for?

I didn't measure this particular case. But since the main problem
with ticket locks is when (host) CPUs are overcommitted, it
certainly is a bad idea to create even more load on the host than
there already is (the more that these are bursts).

Jan

2010-06-30 13:28:14