Message-ID: <1415735332.21508.1.camel@theros.lm.intel.com>
Subject: Re: [PATCH 3/6] x86: Add support for the clwb instruction
From: Ross Zwisler
To: Borislav Petkov
Cc: linux-kernel@vger.kernel.org, H Peter Anvin, Ingo Molnar, Thomas Gleixner, David Airlie, dri-devel@lists.freedesktop.org, x86@kernel.org
Date: Tue, 11 Nov 2014 12:48:52 -0700
In-Reply-To: <20141111191239.GC31523@pd.tnic>
References: <1415731396-19364-1-git-send-email-ross.zwisler@linux.intel.com> <1415731396-19364-4-git-send-email-ross.zwisler@linux.intel.com> <20141111191239.GC31523@pd.tnic>

On Tue, 2014-11-11 at 20:12 +0100, Borislav Petkov wrote:
> On Tue, Nov 11, 2014 at 11:43:13AM -0700, Ross Zwisler wrote:
> > Add support for the new clwb instruction.  This instruction was
> > announced in the document "Intel Architecture Instruction Set Extensions
> > Programming Reference" with reference number 319433-022.
> >
> > https://software.intel.com/sites/default/files/managed/0d/53/319433-022.pdf
> >
> > Here are some things of note:
> >
> >  - As with the clflushopt patches before this, I'm assuming that the addressing
> >    mode generated by the original clflush instruction will match the new
> >    clflush instruction with the 0x66 prefix for clflushopt, and for the
> >    xsaveopt instruction with the 0x66 prefix for clwb.  For all the test cases
> >    that I've come up with and for the new clwb code generated by this patch
> >    series, this has proven to be true on my test machine.
> >
> >  - According to the SDM, xsaveopt has a form where it has a REX.W prefix.  I
> >    believe that this prefix will not be generated by gcc in x86_64 kernel code.
> >    Based on this, I don't believe I need to account for this extra prefix when
> >    dealing with the assembly language created for clwb.  Please correct me if
> >    I'm wrong.
> >
> > Signed-off-by: Ross Zwisler
> > Cc: H Peter Anvin
> > Cc: Ingo Molnar
> > Cc: Thomas Gleixner
> > Cc: David Airlie
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: x86@kernel.org
> > ---
> >  arch/x86/include/asm/cpufeature.h    |  1 +
> >  arch/x86/include/asm/special_insns.h | 10 ++++++++++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> > index b3e6b89..fbbed34 100644
> > --- a/arch/x86/include/asm/cpufeature.h
> > +++ b/arch/x86/include/asm/cpufeature.h
> > @@ -227,6 +227,7 @@
> >  #define X86_FEATURE_SMAP	( 9*32+20) /* Supervisor Mode Access Prevention */
> >  #define X86_FEATURE_PCOMMIT	( 9*32+22) /* PCOMMIT instruction */
> >  #define X86_FEATURE_CLFLUSHOPT	( 9*32+23) /* CLFLUSHOPT instruction */
> > +#define X86_FEATURE_CLWB	( 9*32+24) /* CLWB instruction */
> >  #define X86_FEATURE_AVX512PF	( 9*32+26) /* AVX-512 Prefetch */
> >  #define X86_FEATURE_AVX512ER	( 9*32+27) /* AVX-512 Exponential and Reciprocal */
> >  #define X86_FEATURE_AVX512CD	( 9*32+28) /* AVX-512 Conflict Detection */
> > diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> > index 1709a2e..a328460 100644
> > --- a/arch/x86/include/asm/special_insns.h
> > +++ b/arch/x86/include/asm/special_insns.h
> > @@ -199,6 +199,16 @@ static inline void clflushopt(volatile void *__p)
> >  		       "+m" (*(volatile char __force *)__p));
> >  }
> >
> > +static inline void clwb(volatile void *__p)
> > +{
> > +	alternative_io_2(".byte " __stringify(NOP_DS_PREFIX) "; clflush %P0",
>
> Any particular reason for using 0x3e as a prefix to have the insns be
> the same size or is it simply because CLFLUSH can stomach it?
>
> :-)

Essentially we need one additional byte at the beginning of the clflush
so that we can flip it into a clflushopt by changing that byte into a
0x66 prefix.  Two options are to either insert a 1 byte ASM_NOP1, or to
add a 1 byte NOP_DS_PREFIX.  Both have no functional effect with the
plain clflush, but I've been told that executing a clflush + prefix
should be faster than executing a clflush + NOP.