Date: Mon, 10 Sep 2018 23:22:19 -0700
From: tip-bot for Mikulas Patocka
Cc: mingo@kernel.org, hpa@zytor.com, peterz@infradead.org, tglx@linutronix.de, mpatocka@redhat.com, snitzer@redhat.com, linux-kernel@vger.kernel.org, torvalds@linux-foundation.org, dan.j.williams@intel.com, dm-devel@redhat.com
Reply-To: peterz@infradead.org, hpa@zytor.com, tglx@linutronix.de, mpatocka@redhat.com, snitzer@redhat.com, mingo@kernel.org, dm-devel@redhat.com, linux-kernel@vger.kernel.org, dan.j.williams@intel.com, torvalds@linux-foundation.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip:x86/asm] x86/asm: Optimize memcpy_flushcache()
Git-Commit-ID: 02101c45ec5b19d607af7372680f5259050b4e9c
X-Mailer: tip-git-log-daemon
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Commit-ID:  02101c45ec5b19d607af7372680f5259050b4e9c
Gitweb:     https://git.kernel.org/tip/02101c45ec5b19d607af7372680f5259050b4e9c
Author:     Mikulas Patocka
AuthorDate: Wed, 8 Aug 2018 17:22:16 -0400
Committer:  Ingo Molnar
CommitDate: Mon, 10 Sep 2018 15:17:12 +0200

x86/asm: Optimize memcpy_flushcache()

I use memcpy_flushcache() in my persistent memory driver for metadata
updates. There are many 8-byte and 16-byte updates, and it turns out that
the overhead of memcpy_flushcache() causes a 2% performance degradation
compared to a "movnti" instruction explicitly coded using inline assembler.

The tests were done on a Skylake processor with persistent memory emulated
using the "memmap" kernel parameter. dd was used to copy data to the
dm-writecache target.

This patch recognizes memcpy_flushcache() calls with a constant short length
and turns them into inline assembler, so that I don't have to use inline
assembler in the driver.

Signed-off-by: Mikulas Patocka
Cc: Dan Williams
Cc: Linus Torvalds
Cc: Mike Snitzer
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: device-mapper development
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1808081720460.24747@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar
---
 arch/x86/include/asm/string_64.h | 20 +++++++++++++++++++-
 arch/x86/lib/usercopy_64.c       |  4 ++--
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index d33f92b9fa22..7ad41bfcc16c 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -149,7 +149,25 @@ memcpy_mcsafe(void *dst, const void *src, size_t cnt)
 
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 #define __HAVE_ARCH_MEMCPY_FLUSHCACHE 1
-void memcpy_flushcache(void *dst, const void *src, size_t cnt);
+void __memcpy_flushcache(void *dst, const void *src, size_t cnt);
+static __always_inline void memcpy_flushcache(void *dst, const void *src, size_t cnt)
+{
+	if (__builtin_constant_p(cnt)) {
+		switch (cnt) {
+			case 4:
+				asm ("movntil %1, %0" : "=m"(*(u32 *)dst) : "r"(*(u32 *)src));
+				return;
+			case 8:
+				asm ("movntiq %1, %0" : "=m"(*(u64 *)dst) : "r"(*(u64 *)src));
+				return;
+			case 16:
+				asm ("movntiq %1, %0" : "=m"(*(u64 *)dst) : "r"(*(u64 *)src));
+				asm ("movntiq %1, %0" : "=m"(*(u64 *)(dst + 8)) : "r"(*(u64 *)(src + 8)));
+				return;
+		}
+	}
+	__memcpy_flushcache(dst, src, cnt);
+}
 #endif
 
 #endif /* __KERNEL__ */
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index 9c5606d88f61..c50a1d815a37 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -153,7 +153,7 @@ long __copy_user_flushcache(void *dst, const void __user *src, unsigned size)
 	return rc;
 }
 
-void memcpy_flushcache(void *_dst, const void *_src, size_t size)
+void __memcpy_flushcache(void *_dst, const void *_src, size_t size)
 {
 	unsigned long dest = (unsigned long) _dst;
 	unsigned long source = (unsigned long) _src;
@@ -216,7 +216,7 @@ void memcpy_flushcache(void *_dst, const void *_src, size_t size)
 		clean_cache_range((void *) dest, size);
 	}
 }
-EXPORT_SYMBOL_GPL(memcpy_flushcache);
+EXPORT_SYMBOL_GPL(__memcpy_flushcache);
 
 void memcpy_page_flushcache(char *to, struct page *page, size_t offset,
 		size_t len)
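
For readers who want to experiment outside the kernel, the constant-length dispatch used by the patch can be approximated in userspace C. This is only a rough sketch, not the kernel code: the names copy_dispatch(), copy_fallback(), store_nt32() and store_nt64() are hypothetical, the fallback is a plain memcpy() that does no cache flushing, and GCC or Clang is assumed for __builtin_constant_p() and the inline assembler. On targets other than x86-64 the helpers degrade to ordinary stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the out-of-line __memcpy_flushcache().
 * Assumption: a plain memcpy() is used here; unlike the kernel routine,
 * it neither bypasses nor flushes the CPU cache.
 */
static void copy_fallback(void *dst, const void *src, size_t cnt)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, cnt);
}

/* 4-byte store: non-temporal (movnti) on x86-64, ordinary store elsewhere. */
static inline void store_nt32(uint32_t *dst, uint32_t val)
{
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ ("movntil %1, %0" : "=m"(*dst) : "r"(val));
#else
	*dst = val;
#endif
}

/* 8-byte store: non-temporal (movnti) on x86-64, ordinary store elsewhere. */
static inline void store_nt64(uint64_t *dst, uint64_t val)
{
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ ("movntiq %1, %0" : "=m"(*dst) : "r"(val));
#else
	*dst = val;
#endif
}

/*
 * Hypothetical wrapper with the same shape as the patched
 * memcpy_flushcache(): when the compiler can prove that cnt is a
 * constant 4, 8 or 16, the copy becomes one or two inline stores;
 * every other call goes to the out-of-line fallback.
 */
static inline __attribute__((always_inline))
void copy_dispatch(void *dst, const void *src, size_t cnt)
{
	if (__builtin_constant_p(cnt)) {
		switch (cnt) {
		case 4:
			store_nt32((uint32_t *)dst, *(const uint32_t *)src);
			return;
		case 8:
			store_nt64((uint64_t *)dst, *(const uint64_t *)src);
			return;
		case 16:
			store_nt64((uint64_t *)dst, *(const uint64_t *)src);
			store_nt64((uint64_t *)((char *)dst + 8),
				   *(const uint64_t *)((const char *)src + 8));
			return;
		}
	}
	copy_fallback(dst, src, cnt);
}
```

A caller would typically pass a sizeof() expression, for example copy_dispatch(&slot, &value, sizeof(value)) with a uint64_t value, so that the length is a compile-time constant. Without optimization the compiler may evaluate __builtin_constant_p() to 0, in which case every call simply takes the fallback path and the result is the same. Non-temporal stores are weakly ordered, so code that depends on ordering still needs a store fence; the sketch leaves that out.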