From: Andy Lutomirski
Date: Thu, 15 Dec 2016 09:28:52 -0800
Subject: Re: [patch 2/3] x86/process: Optimize TIF_BLOCKSTEP switch
To: Thomas Gleixner
Cc: LKML, X86 ML, Peter Zijlstra, Kyle Huey, Andy Lutomirski
In-Reply-To: <20161215164240.813682510@linutronix.de>
References: <20161215162648.061449202@linutronix.de> <20161215164240.813682510@linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 15, 2016 at 8:44 AM, Thomas Gleixner wrote:
> Provide and use a separate helper for toggling the DEBUGCTLMSR_BTF bit
> instead of doing it open coded with a branch and eventually evaluating
> boot_cpu_data twice.
>
> x86_64:
>    text    data     bss     dec     hex
>    3694    8505      16   12215    2fb7   Before
>    3662    8505      16   12183    2f97   After
>
> i386:
>    text    data     bss     dec     hex
>    5986    9388    1804   17178    431a   Before
>    5906    9388    1804   17098    42ca   After
>
> Signed-off-by: Thomas Gleixner
> ---
>  arch/x86/include/asm/processor.h | 12 ++++++++++++
>  arch/x86/kernel/process.c        | 10 ++--------
>  2 files changed, 14 insertions(+), 8 deletions(-)
>
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -676,6 +676,18 @@ static inline void update_debugctlmsr(un
>  	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctlmsr);
>  }
>
> +static inline void toggle_debugctlmsr(unsigned long mask)
> +{
> +	unsigned long msrval;
> +
> +#ifndef CONFIG_X86_DEBUGCTLMSR
> +	if (boot_cpu_data.x86 < 6)
> +		return;
> +#endif
> +	rdmsrl(MSR_IA32_DEBUGCTLMSR, msrval);
> +	wrmsrl(MSR_IA32_DEBUGCTLMSR, msrval ^ mask);
> +}
> +

This scares me.
If the MSR ever gets out of sync with the TIF flag, this will
malfunction. And IIRC the MSR is highly magical and the CPU clears it
all by itself under a variety of not-so-well-documented circumstances.

How about adding a real feature bit and doing:

	if (!static_cpu_has(X86_FEATURE_BLOCKSTEP))
		return;

	rdmsrl(MSR_IA32_DEBUGCTLMSR, msrval);
	msrval &= ~DEBUGCTLMSR_BTF;
	msrval |= (tifn & _TIF_BLOCKSTEP) >> TIF_BLOCKSTEP << DEBUGCTLMSR_BTF_SHIFT;
	wrmsrl(MSR_IA32_DEBUGCTLMSR, msrval);

--Andy