From: Linus Torvalds
Date: Fri, 2 Dec 2016 09:32:45 -0800
Subject: Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation
To: Andy Lutomirski, Peter Anvin
Cc: "the arch/x86 maintainers", One Thousand Gnomes, Borislav Petkov, "linux-kernel@vger.kernel.org", Brian Gerst, Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra, Andrew Cooper
In-Reply-To: <0a21157c2233ba7d0781bbf07866b3f2d7e7c25d.1480638597.git.luto@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 1, 2016 at 4:35 PM, Andy Lutomirski wrote:
>
> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
> ~110ns.  But Xen PV will trap CPUID if possible, so IRET-to-self
> should end up being a nice speedup.

So if we care deeply about the performance of this, we should really
ask ourselves how much we need this...

There are *very* few places where we really need to do a full
serializing instruction, and I'd worry that we really don't need it in
many of the places we do this.

The only real case I'm aware of is modifying code that is modified
through a different linear address than it's executed.

Is there anything else where we _really_ need this sync-core thing?
Sure, the microcode loader looks fine, but that doesn't look
particularly performance-critical either.
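
[Editor's sketch, not part of the original message.] For readers unfamiliar with the case described above, here is a minimal userspace illustration of "write through an alias, then serialize with CPUID". The names sync_core_sketch() and patch_then_sync() are hypothetical, the buffer is an ordinary array standing in for a writable alias of code, and this is not the kernel's sync_core() implementation. It assumes GCC or Clang extended asm; on non-x86 targets it degrades to a plain memory barrier.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a CPUID-based sync_core().
 * CPUID is a serializing instruction on x86; leaf 1 / subleaf 0
 * mirrors the CPUID(eax=1, ecx=0) call quoted above.
 */
static inline uint32_t sync_core_sketch(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t eax = 1, ebx, ecx = 0, edx;

    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                     :
                     : "memory");
    (void)ebx;
    (void)edx;
    return eax; /* processor signature for leaf 1 */
#else
    /* Not x86: only a memory barrier, NOT an instruction-stream sync. */
    __sync_synchronize();
    return 0;
#endif
}

/*
 * Hypothetical helper: store new instruction bytes through a writable
 * alias, then serialize before the patched code is executed through
 * its other (executable) linear address on this CPU.
 */
static void patch_then_sync(unsigned char *alias,
                            const unsigned char *insn, size_t len)
{
    memcpy(alias, insn, len);
    (void)sync_core_sketch();
}
```

A caller would pass the writable alias and the replacement bytes, e.g. patch_then_sync(buf, new_insn, 5). When the write goes through the same linear address that is later executed, the sketch's serializing step is the part this thread is questioning.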

So I'd like to know which sync_core is actually so
performance-critical that we care about it, and then I'd like to
understand why it's needed at all, because I suspect a number of them
have been added with the model of "sprinkle random things around and
hope".

Looking at ftrace, for example, which is one of the users, does it
actually _really_ need sync_core() at all? It seems to use the kernel
virtual address, and then the instruction stream will be coherent.

Adding Peter Anvin to the participants list, because iirc he was the
one who really talked to hardware engineers about the synchronization
issues with serializing kernel code.

               Linus