From: Andy Lutomirski
Date: Fri, 2 Dec 2016 11:30:23 -0800
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
To: Linus Torvalds
Cc: Borislav Petkov, Borislav Petkov, Andy Lutomirski, Peter Anvin,
    "the arch/x86 maintainers", One Thousand Gnomes,
    "linux-kernel@vger.kernel.org", Brian Gerst, Matthew Whitehead,
    Henrique de Moraes Holschuh, Peter Zijlstra, Andrew Cooper

On Fri, Dec 2, 2016 at 11:24 AM, Linus Torvalds wrote:
> On Fri, Dec 2, 2016 at 11:20 AM, Borislav Petkov wrote:
>>
>> Something like below?
>
> The optimize-nops thing needs it too, I think.
>
> Again, this will never matter in practice (even if somebody has a i486
> still, the prefetch window size is like 16 bytes or something), but
> from a documentation standpoint it's good.

How's this?

/*
 * This function forces the icache and prefetched instruction stream to
 * catch up with reality in two very specific cases:
 *
 *  a) Text was modified using one virtual address and is about to be executed
 *     from the same physical page at a different virtual address.
 *
 *  b) Text was modified on a different CPU, may subsequently be
 *     executed on this CPU, and you want to make sure the new version
 *     gets executed.  This generally means you're calling this in an IPI.
 *
 * If you're calling this for a different reason, you're probably doing
 * it wrong.
 */
static inline void native_sync_core(void)
{
	...
}

The body will do a MOV-to-CR2 followed by jmp 1f; 1:.  This sequence
should be guaranteed to flush the pipeline on any real CPU.  On Xen it
will do IRET-to-self.

I suppose it could be an unconditional IRET-to-self, but that's a good
deal slower and not a whole lot simpler.  Although if we start doing
it right, performance won't really matter here.

--Andy
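
For illustration only, the native body described above (MOV-to-CR2 followed by jmp 1f; 1:) can be sketched in C with GCC-style inline assembly. This is a sketch under stated assumptions, not the actual patch: sketch_sync_core() and the SKETCH_RING0 macro are hypothetical names, and the value written to CR2 is an assumption, since the mail does not say what gets written. Because MOV-to-CR2 is privileged, the default build swaps in CPUID, an unprivileged serializing instruction, purely as a user-mode stand-in so the code can be compiled and run outside the kernel. The Xen IRET-to-self variant is not shown.

```c
#include <stdint.h>

/*
 * Sketch only. sketch_sync_core() and SKETCH_RING0 are hypothetical names,
 * not the kernel's native_sync_core().
 *
 * Returns 1 if a serializing instruction sequence was executed, 0 if only
 * a compiler barrier was emitted (non-x86 build).
 */
static inline int sketch_sync_core(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(SKETCH_RING0)
	/*
	 * The sequence described in the mail: MOV-to-CR2, then jmp 1f; 1:.
	 * Privileged: this faults outside ring 0. Writing zero is an
	 * assumption; the mail does not specify the value.
	 */
	unsigned long zero = 0;

	__asm__ volatile("mov %0, %%cr2\n\t"
			 "jmp 1f\n"
			 "1:"
			 : : "r"(zero) : "memory");
	return 1;
#elif defined(__x86_64__) || defined(__i386__)
	/*
	 * User-mode stand-in (not what the mail proposes): CPUID is
	 * unprivileged and architecturally serializing.
	 */
	uint32_t eax = 0, ebx, ecx = 0, edx;

	__asm__ volatile("cpuid"
			 : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
			 : : "memory");
	(void)eax; (void)ebx; (void)ecx; (void)edx;
	return 1;
#else
	/* Non-x86 build: compiler barrier only, no serialization claimed. */
	__asm__ volatile("" : : : "memory");
	return 0;
#endif
}
```

Per the comment above, a caller would reach for this only in cases (a) and (b): after modifying text through an alias mapping, or in the IPI that follows a cross-CPU text modification.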