Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp2956335ybi; Thu, 18 Jul 2019 18:04:45 -0700 (PDT) X-Google-Smtp-Source: APXvYqz5O5ftrGbP9lJBnO0qeg+WXOmU/Z5LNGy9fadTiET8S03XAh9QxRFsp5mYsVs+IxNuuj/c X-Received: by 2002:a17:902:467:: with SMTP id 94mr53018064ple.131.1563498285225; Thu, 18 Jul 2019 18:04:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1563498285; cv=none; d=google.com; s=arc-20160816; b=Mrl3nO2R8w/ncXO3BRM1GmwUvoVhBsR50deIB7NfXh5g0R28I983n1R9gvE+MU9QGs stoIEUn42zNQphP/vW3oVFQFLXP2UATsMd8u1u7s+MwAhAD5IK6uzE6e1LWQVLJSNjk6 R3fSgf1AiVaaSKlrujT0fEIrVUI0oCunqVC0rlmXSLH/A172AFBfAskdNpH5dHAxRHVR XyJ25BweB8yqGe9RI7SaiSugcjhp3AVg1oQe4wVimZCjI2yc0TNCAwGd2P1JkB6/VrNA MOEkpV2HlaPXTt4JwB/XOMnrG+ywloLUwNzUKg9/sJq/WRjhOGpCa7Qidz8pCDY0c8Ul Y7LA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from; bh=AfEn5k4rF+4iyOCNDK+njemPaPAtLOhaIH1NFhEtQDU=; b=LmcwjYwGBRtkewrhrWHtraz+fBISk6ZlftbDQp4B7ZEmN5L2AJ1peH5pqVddtY1rWF hKg6Jh+x+qrp/4DUYf9LVeXcAyyse12c44aHiLKwsMUomLfjOTlLfLfu/DOtuC1GRG3H iGoDy1GIH8Xy7HjsdSwfncJ1ZP2OXO6gKfAMGn6A27Q2cF5Kzs8HrUKBrFvetjqEwh2v CnEgSqxzoD3U9RLvKk5xsFWyXvHWrye0JQwBvymdzZOBeSbuoqt+XOpedCqHgg7437+9 q67wy+aPctXtGEH/1qpmMlFnXlv7nHpovYOz2O4tRrJVp5AW3CANbquLrnBy6XXbY/hY 4ZBQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=NONE dis=NONE) header.from=vmware.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g191si847533pgc.331.2019.07.18.18.04.29; Thu, 18 Jul 2019 18:04:45 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=NONE dis=NONE) header.from=vmware.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726728AbfGSBEC (ORCPT + 99 others); Thu, 18 Jul 2019 21:04:02 -0400 Received: from mail-pl1-f196.google.com ([209.85.214.196]:40171 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726645AbfGSBD7 (ORCPT ); Thu, 18 Jul 2019 21:03:59 -0400 Received: by mail-pl1-f196.google.com with SMTP id a93so14718379pla.7 for ; Thu, 18 Jul 2019 18:03:59 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=AfEn5k4rF+4iyOCNDK+njemPaPAtLOhaIH1NFhEtQDU=; b=OWZU34Z10dfkKPVRMCR09vXrHEVJy64/m6t9wpT/Z37/d6xD6xhOS9Rp/Cflo2Ic1J ra+XakdWkWRLLH5qiSOLZP5ZdMoLkWw3NILb/aIfisL/Sde1Z2Sj3IjfTVvFXz1SGT/2 xyk0QakcrsNHbr46BhudRSIpVBjPFggjn4T9Oix/ICoeWstUn26u/nwQqgTtnYYW05a8 5g2c4tf6OyaM5TpSDQNtggACNUNxSKsDTeRazu3R6KwVLZ37DAGPhzPTm0PYnOy+wWni GLH0hgg7vqifp9Y1yMfPLXcy5q1L668j/yscvtuf3igpT8wliWzy/q5ffZVgG1KF11MU Qghw== X-Gm-Message-State: APjAAAVBRuNo/4TikxQA5fC7fZh/GvONJV8rKzPG01QUQWoun596T4fa pJKSosJFaArW7C8diA2U3nI= X-Received: by 2002:a17:902:be15:: with SMTP id r21mr52291590pls.293.1563498238781; Thu, 18 Jul 2019 18:03:58 -0700 (PDT) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id q144sm28887612pfc.103.2019.07.18.18.03.57 (version=TLS1_3 cipher=AEAD-AES256-GCM-SHA384 bits=256/256); Thu, 18 Jul 2019 18:03:58 -0700 (PDT) From: Nadav Amit To: Peter Zijlstra Cc: Andy Lutomirski , x86@kernel.org, linux-kernel@vger.kernel.org, Dave Hansen , Thomas Gleixner , Ingo Molnar , Nadav Amit Subject: [RFC 7/7] x86/current: Aggressive caching of current Date: Thu, 18 Jul 2019 10:41:10 -0700 Message-Id: <20190718174110.4635-8-namit@vmware.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190718174110.4635-1-namit@vmware.com> References: <20190718174110.4635-1-namit@vmware.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The current_task is supposed to be constant in each thread and therefore does not need to be reread. There is already an attempt to cache it using inline assembly, using this_cpu_read_stable(), which hides the dependency on the read memory address. However, this caching is not working very well. For example, sync_mm_rss() still reads current_task twice for no reason. Allow more aggressive caching by aliasing current_task to into a constant const_current_task and reading from the constant copy. Doing so requires the compiler to support x86 segment qualifiers. Hide const_current_task in a different compilation unit to avoid the compiler from assuming that the value is constant during compilation. Signed-off-by: Nadav Amit --- arch/x86/include/asm/current.h | 30 ++++++++++++++++++++++++++++++ arch/x86/kernel/cpu/Makefile | 1 + arch/x86/kernel/cpu/common.c | 7 +------ arch/x86/kernel/cpu/current.c | 16 ++++++++++++++++ include/linux/compiler.h | 2 +- 5 files changed, 49 insertions(+), 7 deletions(-) create mode 100644 arch/x86/kernel/cpu/current.c diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h index 3e204e6140b5..7f093e81a647 100644 --- a/arch/x86/include/asm/current.h +++ b/arch/x86/include/asm/current.h @@ -10,11 +10,41 @@ struct task_struct; DECLARE_PER_CPU(struct task_struct *, current_task); +#if USE_X86_SEG_SUPPORT + +/* + * Hold a constant alias for current_task, which would allow to avoid caching of + * current task. + * + * We must mark const_current_task with the segment qualifiers, as otherwise gcc + * would do redundant reads of const_current_task. + */ +DECLARE_PER_CPU(struct task_struct * const __percpu_seg_override, const_current_task); + +static __always_inline struct task_struct *get_current(void) +{ + + /* + * GCC is missing functionality of removing segment qualifiers, which + * messes with per-cpu infrastructure that holds local copies. Use + * __raw_cpu_read to avoid holding any copy. + */ + return __raw_cpu_read(, const_current_task); +} + +#else /* USE_X86_SEG_SUPPORT */ + +/* + * Without segment qualifier support, the per-cpu infrastrucutre is not + * suitable for reading constants, so use this_cpu_read_stable() in this case. + */ static __always_inline struct task_struct *get_current(void) { return this_cpu_read_stable(current_task); } +#endif /* USE_X86_SEG_SUPPORT */ + #define current get_current() #endif /* __ASSEMBLY__ */ diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile index d7a1e5a9331c..d816f03a37d7 100644 --- a/arch/x86/kernel/cpu/Makefile +++ b/arch/x86/kernel/cpu/Makefile @@ -19,6 +19,7 @@ CFLAGS_common.o := $(nostackp) obj-y := cacheinfo.o scattered.o topology.o obj-y += common.o +obj-y += current.o obj-y += rdrand.o obj-y += match.o obj-y += bugs.o diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index e0489d2860d3..33a6b51e8059 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1607,13 +1607,8 @@ DEFINE_PER_CPU_FIRST(struct fixed_percpu_data, EXPORT_PER_CPU_SYMBOL_GPL(fixed_percpu_data); /* - * The following percpu variables are hot. Align current_task to - * cacheline size such that they fall in the same cacheline. + * The following percpu variables are hot. */ -DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned = - &init_task; -EXPORT_PER_CPU_SYMBOL(current_task); - DEFINE_PER_CPU(struct irq_stack *, hardirq_stack_ptr); DEFINE_PER_CPU(unsigned int, irq_count) __visible = -1; diff --git a/arch/x86/kernel/cpu/current.c b/arch/x86/kernel/cpu/current.c new file mode 100644 index 000000000000..3238c6e34984 --- /dev/null +++ b/arch/x86/kernel/cpu/current.c @@ -0,0 +1,16 @@ +#include +#include + +/* + * Align current_task to cacheline size such that they fall in the same + * cacheline. + */ +DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned = + &init_task; +EXPORT_PER_CPU_SYMBOL(current_task); + +#if USE_X86_SEG_SUPPORT +DECLARE_PER_CPU(struct task_struct * const __percpu_seg_override, const_current_task) + __attribute__((alias("current_task"))); +EXPORT_PER_CPU_SYMBOL(const_current_task); +#endif diff --git a/include/linux/compiler.h b/include/linux/compiler.h index f0fd5636fddb..1b6ee9ab6373 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -299,7 +299,7 @@ unsigned long read_word_at_a_time(const void *addr) */ #define __ADDRESSABLE(sym) \ static void * __section(".discard.addressable") __used \ - __PASTE(__addressable_##sym, __LINE__) = (void *)&sym; + __PASTE(__addressable_##sym, __LINE__) = (void *)(uintptr_t)&sym; /** * offset_to_ptr - convert a relative memory offset to an absolute pointer -- 2.17.1