Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp258952ybb; Thu, 19 Mar 2020 21:24:48 -0700 (PDT) X-Google-Smtp-Source: ADFU+vu7Dkpb35NH0JQ5sE1aMyWiEDlp68P8bT7VNcgKuBZUcpP7z4EdG+JfcxjrYSuFYCmxUvVp X-Received: by 2002:a9d:5e06:: with SMTP id d6mr5339877oti.311.1584678287909; Thu, 19 Mar 2020 21:24:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584678287; cv=none; d=google.com; s=arc-20160816; b=Xebd09Ff1YOVYahX130x8pbmdSNmUTfLdRApudlVZ1m8w4aifBdBpGLMd9M77msG/T +s6AZ88S21fIM5BqpWIYjQhdo9fKpb6okcU76s6viaCn3gKr/QpiBN8nd6zR8pPH1Vd8 MgVrkb3+tsXBfB2OcR1uABA0xV+8LSxq4fIX0QzODTYvPTKOewZjJdRniLJuKQn++oXG HUUfIwf7WGLfS5mbhyc58te8dK8rBWNZIhWxSGpkgeR8kQHyOXq8tfzk16WyVqkSBYEz LgtoOsnwFnwpbuol3ecWMYGzC0eNwgAqud9eEJjN8Z2P59pcN3bVhNBAIor9jOih6T4N ZFFQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=NSjEoiHXHJADDgPWCodT+/YBBexTOrqOdTc2VghM02M=; b=LTSCGwwYDHBv1p6eeyddRQVnOFBn5f6u1BBFwgTKcSmPjT6cPA6p+Ky8XT2x0btKUV ymuYujumNepg3XYTM8qexM+doem4vk+Cu115J4pTOSS4cXge0oDjX5Ji4osUIOqxYoqV EfphcLMxfYlEmFzr1JjZoKt1HiRoi7EJp0WhJ5q0JCsuiZgS1IPDV+irBP/I07ooYW1U 3RGA0lTwTJj0j7VWexec/14+eNVZvcg5O0SjuE+3KfG+XpV8B8ukkHHU/O1xm7kMSEhi DMCA3qADp6Qjm7l3SNNxvwUXNCmjPeLeTFmBRKgraOy6X29KsMB8qYqxA5wNYcE9c+dW UFIQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=SQ5BRfBd; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v7si2263127otp.43.2020.03.19.21.24.35; Thu, 19 Mar 2020 21:24:47 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=SQ5BRfBd; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727197AbgCTEXa (ORCPT + 99 others); Fri, 20 Mar 2020 00:23:30 -0400 Received: from mail.kernel.org ([198.145.29.99]:51490 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727113AbgCTEX3 (ORCPT ); Fri, 20 Mar 2020 00:23:29 -0400 Received: from mail-wm1-f47.google.com (mail-wm1-f47.google.com [209.85.128.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id AC26E20777 for ; Fri, 20 Mar 2020 04:23:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1584678209; bh=RwUpts3my+2j4S9FwhIam1+GYrL1NFcEKY29ijsR0f0=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=SQ5BRfBdjGnHWHEOv22TTrCx+hiIZnXlTkt0193blhy5p+OqMwsX6LuafM4HxXK3v nGVfo+rQRRQspZEs8dtCKCugi1IFzKNAm7KeMLAlcex9N1PEWrGstL9Ah9SUilM3sG 4WT0MqPoSywO8k8Llp8+4lZkW6k++GNv34pdaVAQ= Received: by mail-wm1-f47.google.com with SMTP id 6so5105087wmi.5 for ; Thu, 19 Mar 2020 21:23:28 -0700 (PDT) X-Gm-Message-State: ANhLgQ39qMUANDoRk0kj2Cu4Dsz8ExJBBUvAmVRjnOJSXa9fqYoXT0jf 6/f32wTfWWxOwIz40O3hB40a4KZ/jq74arwyVGUjBA== X-Received: by 2002:a7b:c8ce:: with SMTP id f14mr7663108wml.138.1584678206958; Thu, 19 Mar 2020 21:23:26 -0700 (PDT) MIME-Version: 1.0 References: <1584677604-32707-1-git-send-email-kyung.min.park@intel.com> <1584677604-32707-3-git-send-email-kyung.min.park@intel.com> In-Reply-To: <1584677604-32707-3-git-send-email-kyung.min.park@intel.com> From: Andy Lutomirski Date: Thu, 19 Mar 2020 21:23:15 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v2 2/2] x86/delay: Introduce TPAUSE delay To: Kyung Min Park Cc: X86 ML , LKML , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , Greg KH , Andi Kleen , Tony Luck , "Raj, Ashok" , "Ravi V. Shankar" , Fenghua Yu Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 19, 2020 at 9:13 PM Kyung Min Park wrote: > > TPAUSE instructs the processor to enter an implementation-dependent > optimized state. The instruction execution wakes up when the time-stamp > counter reaches or exceeds the implicit EDX:EAX 64-bit input value. > The instruction execution also wakes up due to the expiration of > the operating system time-limit or by an external interrupt > or exceptions such as a debug exception or a machine check exception. > > TPAUSE offers a choice of two lower power states: > 1. Light-weight power/performance optimized state C0.1 > 2. Improved power/performance optimized state C0.2 > This way, it can save power with low wake-up latency in comparison to > spinloop based delay. The selection between the two is governed by the > input register. > > TPAUSE is available on processors with X86_FEATURE_WAITPKG. > > Reviewed-by: Tony Luck > Co-developed-by: Fenghua Yu > Signed-off-by: Fenghua Yu > Signed-off-by: Kyung Min Park > --- > arch/x86/include/asm/mwait.h | 17 +++++++++++++++++ > arch/x86/lib/delay.c | 27 ++++++++++++++++++++++++++- > 2 files changed, 43 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h > index aaf6643..fd59db0 100644 > --- a/arch/x86/include/asm/mwait.h > +++ b/arch/x86/include/asm/mwait.h > @@ -22,6 +22,8 @@ > #define MWAITX_ECX_TIMER_ENABLE BIT(1) > #define MWAITX_MAX_WAIT_CYCLES UINT_MAX > #define MWAITX_DISABLE_CSTATES 0xf0 > +#define TPAUSE_C01_STATE 1 > +#define TPAUSE_C02_STATE 0 > > static inline void __monitor(const void *eax, unsigned long ecx, > unsigned long edx) > @@ -120,4 +122,19 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx) > current_clr_polling(); > } > > +/* > + * Caller can specify whether to enter C0.1 (low latency, less > + * power saving) or C0.2 state (saves more power, but longer wakeup > + * latency). This may be overridden by the IA32_UMWAIT_CONTROL MSR > + * which can force requests for C0.2 to be downgraded to C0.1. > + */ > +static inline void __tpause(unsigned int ecx, unsigned int edx, > + unsigned int eax) > +{ > + /* "tpause %ecx, %edx, %eax;" */ > + asm volatile(".byte 0x66, 0x0f, 0xae, 0xf1\t\n" > + : > + : "c"(ecx), "d"(edx), "a"(eax)); > +} > + > #endif /* _ASM_X86_MWAIT_H */ > diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c > index e6db855..5f11f0a 100644 > --- a/arch/x86/lib/delay.c > +++ b/arch/x86/lib/delay.c > @@ -97,6 +97,27 @@ static void delay_tsc(u64 cycles) > } > > /* > + * On Intel the TPAUSE instruction waits until any of: > + * 1) the TSC counter exceeds the value provided in EAX:EDX > + * 2) global timeout in IA32_UMWAIT_CONTROL is exceeded > + * 3) an external interrupt occurs > + */ > +static void delay_halt_tpause(u64 start, u64 cycles) > +{ > + u64 until = start + cycles; > + unsigned int eax, edx; > + > + eax = (unsigned int)(until & 0xffffffff); > + edx = (unsigned int)(until >> 32); > + > + /* > + * Hard code the deeper (C0.2) sleep state because exit latency is > + * small compared to the "microseconds" that usleep() will delay. > + */ > + __tpause(TPAUSE_C02_STATE, edx, eax); > +} > + > +/* > * On some AMD platforms, MWAITX has a configurable 32-bit timer, that > * counts with TSC frequency. The input value is the number of TSC cycles > * to wait. MWAITX will also exit when the timer expires. > @@ -152,8 +173,12 @@ static void delay_halt(u64 __cycles) > > void use_tsc_delay(void) > { > - if (delay_fn == delay_loop) > + if (static_cpu_has(X86_FEATURE_WAITPKG)) { > + delay_halt_fn = delay_halt_tpause; > + delay_fn = delay_halt; > + } else if (delay_fn == delay_loop) { > delay_fn = delay_tsc; > + } > } This is an odd way to dispatch: you're using static_cpu_has(), but you're using it once to populate a function pointer. Why not just put the static_cpu_has() directly into delay_halt() and open-code the three variants? That will also make it a lot easier to understand the oddity with start and cycles. --Andy > -- > 2.7.4 >