Message-Id: <584688FD02000078001257BB@prv-mh.provo.novell.com>
Date: Tue, 06 Dec 2016 01:46:37 -0700
From: "Jan Beulich"
To: "Andy Lutomirski"
Cc: "Borislav Petkov", "Andrew Cooper", "Brian Gerst", "Matthew Whitehead", "Henrique de Moraes Holschuh", "Peter Zijlstra", "xen-devel", "One Thousand Gnomes", "Boris Ostrovsky", "Juergen Gross", "linux-kernel@vger.kernel.org", "H. Peter Anvin"
Subject: Re: [Xen-devel] [PATCH v3 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self

>>> On 05.12.16 at 22:32, wrote:
> static inline void sync_core(void)
> {
> -	int tmp;
> -
> -#ifdef CONFIG_X86_32
>  	/*
> -	 * Do a CPUID if available, otherwise do a jump.  The jump
> -	 * can conveniently enough be the jump around CPUID.
> +	 * There are quite a few ways to do this.  IRET-to-self is nice
> +	 * because it works on every CPU, at any CPL (so it's compatible
> +	 * with paravirtualization), and it never exits to a hypervisor.
> +	 * The only down sides are that it's a bit slow (it seems to be
> +	 * a bit more than 2x slower than the fastest options) and that
> +	 * it unmasks NMIs.  The "push %cs" is needed because, in
> +	 * paravirtual environments, __KERNEL_CS may not be a valid CS
> +	 * value when we do IRET directly.
> +	 *
> +	 * In case NMI unmasking or performance ever becomes a problem,
> +	 * the next best option appears to be MOV-to-CR2 and an
> +	 * unconditional jump.  That sequence also works on all CPUs,
> +	 * but it will fault at CPL3.

CPL > 0 I think.

> +	 * CPUID is the conventional way, but it's nasty: it doesn't
> +	 * exist on some 486-like CPUs, and it usually exits to a
> +	 * hypervisor.
>  	 */
> -	asm volatile("cmpl %2,%1\n\t"
> -		     "jl 1f\n\t"
> -		     "cpuid\n"
> -		     "1:"
> -		     : "=a" (tmp)
> -		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
> -		     : "ebx", "ecx", "edx", "memory");
> +	register void *__sp asm(_ASM_SP);
> +
> +#ifdef CONFIG_X86_32
> +	asm volatile (
> +		"pushfl\n\t"
> +		"pushl %%cs\n\t"
> +		"pushl $1f\n\t"
> +		"iret\n\t"
> +		"1:"
> +		: "+r" (__sp) : : "cc", "memory");

I don't think EFLAGS (i.e. "cc") gets modified anywhere here. And the
memory clobber would perhaps better be pulled out into an explicit
barrier() invocation (making it more obvious what it's needed for)?

>  #else
> -	/*
> -	 * CPUID is a barrier to speculative execution.
> -	 * Prefetched instructions are automatically
> -	 * invalidated when modified.
> -	 */
> -	asm volatile("cpuid"
> -		     : "=a" (tmp)
> -		     : "0" (1)
> -		     : "ebx", "ecx", "edx", "memory");
> +	unsigned long tmp;
> +
> +	asm volatile (
> +		"movq %%ss, %0\n\t"
> +		"pushq %0\n\t"
> +		"pushq %%rsp\n\t"
> +		"addq $8, (%%rsp)\n\t"
> +		"pushfq\n\t"
> +		"movq %%cs, %0\n\t"
> +		"pushq %0\n\t"
> +		"pushq $1f\n\t"
> +		"iretq\n\t"
> +		"1:"
> +		: "=r" (tmp), "+r" (__sp) : : "cc", "memory");

The first output needs to be "=&r". And is movq really a good idea for
selector reads? Why don't you make tmp unsigned int, use plain mov, and
use %q0 as pushq's operands?

Jan