Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753187AbcLBUle (ORCPT ); Fri, 2 Dec 2016 15:41:34 -0500 Received: from mail-ua0-f174.google.com ([209.85.217.174]:34994 "EHLO mail-ua0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751075AbcLBUlc (ORCPT ); Fri, 2 Dec 2016 15:41:32 -0500 MIME-Version: 1.0 In-Reply-To: References: <0a21157c2233ba7d0781bbf07866b3f2d7e7c25d.1480638597.git.luto@kernel.org> <20161202180343.gehqor7lgtmzwqq3@pd.tnic> <20161202185008.tdziqrzi4a3axord@pd.tnic> <20161202192050.l5l3rcwems6hptub@pd.tnic> From: Andy Lutomirski Date: Fri, 2 Dec 2016 12:41:11 -0800 Message-ID: Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation To: Linus Torvalds Cc: Borislav Petkov , Borislav Petkov , Andy Lutomirski , Peter Anvin , "the arch/x86 maintainers" , One Thousand Gnomes , "linux-kernel@vger.kernel.org" , Brian Gerst , Matthew Whitehead , Henrique de Moraes Holschuh , Peter Zijlstra , Andrew Cooper Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1978 Lines: 48 On Fri, Dec 2, 2016 at 11:35 AM, Linus Torvalds wrote: > On Fri, Dec 2, 2016 at 11:30 AM, Andy Lutomirski wrote: >> >> How's this? > > Looks ok. I do think that > >> I suppose it could be an unconditional IRET-to-self, but that's a good >> deal slower and not a whole lot simpler. Although if we start doing >> it right, performance won't really matter here. > > Considering you already got the iret-to-self wrong in the first > version, I really like the "one single unconditional version" so that > everybody tests that _one_ thing and there isn't anything subtle going > on. > > Hmm? Okay, sold. It makes the patchset much much shorter, too. > > And yes, if it turns out that performance matters, we almost certainly > are doing something really wrong, and we shouldn't be using that > sync_core() thing in that place anyway. To play devil's advocate (and definitely out of scope for this particular patchset), is user code permitted to do: 1. Map a page RX at one address and RW somewhere else (for nice ASLR). 2. Write to the RW mapping. 3. CPUID or IRET-to-self. 4. Jump to the RX mapping. Because, if so, we should maybe serialize whenever we migrate a process to a different CPU. (We *definitely* need to flush the store buffer when migrating, because the x86 architecture makes some memory ordering promises that get broken if a store from a thread stays in the store buffer of a different CPU when the thread gets migrated.) And if we're going to start serializing when migrating a thread, then we actually care about performance, in which case we should optimize the crap out of this thing, which probably means using MFENCE on AMD CPUs (AMD promises that MFENCE flushes the pipeline. Intel seems to be confused as to exactly what effect MFENCE has, or at least I'm confused as to what Intel thinks MFENCE does.) And we should make sure that we only do the extra flush when we don't switch mms. --Andy