Date: Tue, 20 Dec 2016 01:35:08 -0800
From: tip-bot for Borislav Petkov
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, torvalds@linux-foundation.org,
    peterz@infradead.org, tglx@linutronix.de, bp@suse.de, luto@amacapital.net,
    gnomes@lxorguk.ukuu.org.uk, luto@kernel.org, hmh@hmh.eng.br,
    brgerst@gmail.com, mingo@kernel.org, tedheadster@gmail.com,
    andrew.cooper3@citrix.com
In-Reply-To: <20161203150258.vwr5zzco7ctgc4pe@pd.tnic>
References: <20161203150258.vwr5zzco7ctgc4pe@pd.tnic>
Subject: [tip:x86/urgent] x86/alternatives: Do not use sync_core() to serialize I$

Commit-ID:  34bfab0eaf0fb5c6fb14c6b4013b06cdc7984466
Gitweb:     http://git.kernel.org/tip/34bfab0eaf0fb5c6fb14c6b4013b06cdc7984466
Author:     Borislav Petkov
AuthorDate: Sat, 3 Dec 2016 16:02:58 +0100
Committer:  Ingo Molnar
CommitDate: Tue, 20 Dec 2016 09:36:42 +0100

x86/alternatives: Do not use sync_core() to serialize I$

We use
sync_core() in the alternatives code to stop speculative execution of
prefetched instructions because we are potentially changing them and
don't want to execute stale bytes.

What it does on most machines is call CPUID, which is a serializing
instruction. And that's expensive.

However, the instruction cache is serialized when we're on the local CPU
and are changing the data through the same virtual address. So we don't
need the serializing CPUID, but only a simple control flow change. The
latter is accomplished with a CALL/RET, which the noinline annotation
forces the compiler to emit.

Suggested-by: Linus Torvalds
Signed-off-by: Borislav Petkov
Reviewed-by: Andy Lutomirski
Cc: Andrew Cooper
Cc: Andy Lutomirski
Cc: Brian Gerst
Cc: Henrique de Moraes Holschuh
Cc: Matthew Whitehead
Cc: One Thousand Gnomes
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20161203150258.vwr5zzco7ctgc4pe@pd.tnic
Signed-off-by: Ingo Molnar
---
 arch/x86/kernel/alternative.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5cb272a..c5b8f76 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -337,7 +337,11 @@ done:
 		 n_dspl, (unsigned long)orig_insn + n_dspl + repl_len);
 }
 
-static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
+/*
+ * "noinline" to cause control flow change and thus invalidate I$ and
+ * cause refetch after modification.
+ */
+static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
 {
 	unsigned long flags;
 
@@ -346,7 +350,6 @@ static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
 
 	local_irq_save(flags);
 	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
-	sync_core();
 	local_irq_restore(flags);
 
 	DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
@@ -359,9 +362,12 @@ static void __init_or_module optimize_nops(struct alt_instr *a, u8 *instr)
  * This implies that asymmetric systems where APs have less capabilities than
  * the boot processor are not handled. Tough. Make sure you disable such
  * features by hand.
+ *
+ * Marked "noinline" to cause control flow change and thus insn cache
+ * to refetch changed I$ lines.
  */
-void __init_or_module apply_alternatives(struct alt_instr *start,
-					 struct alt_instr *end)
+void __init_or_module noinline apply_alternatives(struct alt_instr *start,
+						  struct alt_instr *end)
 {
 	struct alt_instr *a;
 	u8 *instr, *replacement;
@@ -667,7 +673,6 @@ void *__init_or_module text_poke_early(void *addr, const void *opcode,
 	unsigned long flags;
 
 	local_irq_save(flags);
 	memcpy(addr, opcode, len);
-	sync_core();
 	local_irq_restore(flags);
 	/*
 	 * Could also do a CLFLUSH here to speed up CPU recovery; but
 	 * that causes hangs on some VIA CPUs.
 	 */