Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752935AbdDKAks (ORCPT ); Mon, 10 Apr 2017 20:40:48 -0400 Received: from mga01.intel.com ([192.55.52.88]:48690 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752513AbdDKAkr (ORCPT ); Mon, 10 Apr 2017 20:40:47 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.37,184,1488873600"; d="scan'208";a="247003344" Subject: [PATCH v3] x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions From: Dan Williams To: linux-nvdimm@ml01.01.org Cc: Jan Kara , Toshi Kani , Matthew Wilcox , x86@kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Christoph Hellwig , Jeff Moyer , Ingo Molnar , Al Viro , "H. Peter Anvin" , Thomas Gleixner , Ross Zwisler Date: Mon, 10 Apr 2017 17:35:02 -0700 Message-ID: <149187075199.40875.9829505688202257056.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.17.1-9-g687f MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3941 Lines: 105 Before we rework the "pmem api" to stop abusing __copy_user_nocache() for memcpy_to_pmem() we need to fix cases where we may strand dirty data in the cpu cache. The problem occurs when copy_from_iter_pmem() is used for arbitrary data transfers from userspace. There is no guarantee that these transfers, performed by dax_iomap_actor(), will have aligned destinations or aligned transfer lengths. Backstop the usage __copy_user_nocache() with explicit cache management in these unaligned cases. Yes, copy_from_iter_pmem() is now too big for an inline, but addressing that is saved for a later patch that moves the entirety of the "pmem api" into the pmem driver directly. Fixes: 5de490daec8b ("pmem: add copy_from_iter_pmem() and clear_pmem()") Cc: Cc: Cc: Jan Kara Cc: Jeff Moyer Cc: Ingo Molnar Cc: Christoph Hellwig Cc: "H. Peter Anvin" Cc: Al Viro Cc: Thomas Gleixner Cc: Matthew Wilcox Cc: Ross Zwisler Signed-off-by: Toshi Kani Signed-off-by: Dan Williams --- Changes in v3: * match the implementation to the notes at the top of __copy_user_nocache (Toshi) * Switch to using the IS_ALIGNED() macro to make alignment checks easier to read and harder to get wrong like they were in v2. (Toshi, Dan) arch/x86/include/asm/pmem.h | 42 +++++++++++++++++++++++++++++++----------- 1 file changed, 31 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/pmem.h b/arch/x86/include/asm/pmem.h index 2c1ebeb4d737..529bb4a6487a 100644 --- a/arch/x86/include/asm/pmem.h +++ b/arch/x86/include/asm/pmem.h @@ -55,7 +55,8 @@ static inline int arch_memcpy_from_pmem(void *dst, const void *src, size_t n) * @size: number of bytes to write back * * Write back a cache range using the CLWB (cache line write back) - * instruction. + * instruction. Note that @size is internally rounded up to be cache + * line size aligned. */ static inline void arch_wb_cache_pmem(void *addr, size_t size) { @@ -69,15 +70,6 @@ static inline void arch_wb_cache_pmem(void *addr, size_t size) clwb(p); } -/* - * copy_from_iter_nocache() on x86 only uses non-temporal stores for iovec - * iterators, so for other types (bvec & kvec) we must do a cache write-back. - */ -static inline bool __iter_needs_pmem_wb(struct iov_iter *i) -{ - return iter_is_iovec(i) == false; -} - /** * arch_copy_from_iter_pmem - copy data from an iterator to PMEM * @addr: PMEM destination address @@ -94,7 +86,35 @@ static inline size_t arch_copy_from_iter_pmem(void *addr, size_t bytes, /* TODO: skip the write-back by always using non-temporal stores */ len = copy_from_iter_nocache(addr, bytes, i); - if (__iter_needs_pmem_wb(i)) + /* + * In the iovec case on x86_64 copy_from_iter_nocache() uses + * non-temporal stores for the bulk of the transfer, but we need + * to manually flush if the transfer is unaligned. A cached + * memory copy is used when destination or size is not naturally + * aligned. That is: + * - Require 8-byte alignment when size is 8 bytes or larger. + * - Require 4-byte alignment when size is 4 bytes. + * + * In the non-iovec case the entire destination needs to be + * flushed. + */ + if (iter_is_iovec(i)) { + unsigned long flushed, dest = (unsigned long) addr; + + if (bytes < 8) { + if (!IS_ALIGNED(dest, 4) || (bytes != 4)) + arch_wb_cache_pmem(addr, 1); + } else { + if (!IS_ALIGNED(dest, 8)) { + dest = ALIGN(dest, boot_cpu_data.x86_clflush_size); + arch_wb_cache_pmem(addr, 1); + } + + flushed = dest - (unsigned long) addr; + if (bytes > flushed && !IS_ALIGNED(bytes - flushed, 8)) + arch_wb_cache_pmem(addr + bytes - 1, 1); + } + } else arch_wb_cache_pmem(addr, bytes); return len;