From: "Kani, Toshimitsu"
To: Dan Williams, "linux-nvdimm@lists.01.org"
Cc: Jan Kara, Matthew Wilcox, "x86@kernel.org", "linux-kernel@vger.kernel.org",
    "stable@vger.kernel.org", Christoph Hellwig, Jeff Moyer, Ingo Molnar,
    Al Viro, "H. Peter Anvin", Thomas Gleixner, Ross Zwisler
Subject: RE: [PATCH v2] x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions
Date: Mon, 10 Apr 2017 18:53:42 +0000
In-Reply-To: <149161025237.38725.13508986873214668503.stgit@dwillia2-desk3.amr.corp.intel.com>

> Subject: [PATCH v2] x86, pmem: fix broken __copy_user_nocache cache-
> bypass assumptions
>
> Before we rework the "pmem api" to stop abusing __copy_user_nocache()
> for memcpy_to_pmem() we need to fix cases where we may strand dirty data
> in the cpu cache.  The problem occurs when copy_from_iter_pmem() is used
> for arbitrary data transfers from userspace.  There is no guarantee that
> these transfers, performed by dax_iomap_actor(), will have aligned
> destinations or aligned transfer lengths.  Backstop the usage of
> __copy_user_nocache() with explicit cache management in these unaligned
> cases.
>
> Yes, copy_from_iter_pmem() is now too big for an inline, but addressing
> that is saved for a later patch that moves the entirety of the "pmem
> api" into the pmem driver directly.
 :
> ---
> v2: Change the condition for flushing the last cacheline of the
>     destination from 8-byte to 4-byte misalignment (Toshi)
 :
>  arch/x86/include/asm/pmem.h |   41 ++++++++++++++++++++++++++++++----
 :
> @@ -94,7 +86,34 @@ static inline size_t arch_copy_from_iter_pmem(void *addr, size_t bytes,
>         /* TODO: skip the write-back by always using non-temporal stores */
>         len = copy_from_iter_nocache(addr, bytes, i);
>
> -       if (__iter_needs_pmem_wb(i))
> +       /*
> +        * In the iovec case on x86_64 copy_from_iter_nocache() uses
> +        * non-temporal stores for the bulk of the transfer, but we need
> +        * to manually flush if the transfer is unaligned. In the
> +        * non-iovec case the entire destination needs to be flushed.
> +        */
> +       if (iter_is_iovec(i)) {
> +               unsigned long dest = (unsigned long) addr;
> +
> +               /*
> +                * If the destination is not 8-byte aligned then
> +                * __copy_user_nocache (on x86_64) uses cached copies
> +                */
> +               if (dest & 8) {
> +                       arch_wb_cache_pmem(addr, 1);
> +                       dest = ALIGN(dest, 8);
> +               }
> +
> +               /*
> +                * If the remaining transfer length, after accounting
> +                * for destination alignment, is not 4-byte aligned
> +                * then __copy_user_nocache() falls back to cached
> +                * copies for the trailing bytes in the final cacheline
> +                * of the transfer.
> +                */
> +               if ((bytes - (dest - (unsigned long) addr)) & 4)
> +                       arch_wb_cache_pmem(addr + bytes - 1, 1);
> +       } else
>                 arch_wb_cache_pmem(addr, bytes);
>
>         return len;

Thanks for the update.  I think the alignment check should be based on
the following note in __copy_user_nocache:

 * Note: Cached memory copy is used when destination or size is not
 *       naturally aligned. That is:
 *  - Require 8-byte alignment when size is 8 bytes or larger.
 *  - Require 4-byte alignment when size is 4 bytes.

So I think the code could be something like this.  I also made the
following changes:
 - Mask with 7, not 8.
 - ALIGN with the cacheline size, instead of 8.
 - Add a (bytes > flushed) test, since the unsigned long subtraction
   would otherwise turn a negative difference into a huge positive value.

        if (bytes < 8) {
                if ((dest & 3) || (bytes != 4))
                        arch_wb_cache_pmem(addr, 1);
        } else {
                if (dest & 7) {
                        dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
                        arch_wb_cache_pmem(addr, 1);
                }

                flushed = dest - (unsigned long) addr;
                if ((bytes > flushed) && ((bytes - flushed) & 7))
                        arch_wb_cache_pmem(addr + bytes - 1, 1);
        }

Thanks,
-Toshi
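
For illustration, here is a minimal user-space sketch of the flush decision
above.  It assumes a 64-byte cacheline and replaces arch_wb_cache_pmem() with
a printing stub (wb_cache_pmem and flush_unaligned are hypothetical names for
this sketch only); it shows which head/tail flushes the proposed logic would
issue for a given destination and length, not the real pmem write-back path.

        #include <stdio.h>
        #include <stdint.h>

        #define CLFLUSH_SIZE 64UL   /* assumed x86 cacheline size */
        #define ALIGN(x, a)  (((x) + (a) - 1) & ~((a) - 1))

        /* stand-in for arch_wb_cache_pmem(): just report what would be flushed */
        static void wb_cache_pmem(uintptr_t addr, size_t len)
        {
                printf("  would flush %zu byte(s) at 0x%lx\n", len, (unsigned long)addr);
        }

        /* mirror of the proposed backstop for unaligned __copy_user_nocache() use */
        static void flush_unaligned(uintptr_t dest, size_t bytes)
        {
                uintptr_t addr = dest;
                size_t flushed;

                printf("dest=0x%lx bytes=%zu\n", (unsigned long)dest, bytes);
                if (bytes < 8) {
                        /* below 8 bytes, only a 4-byte aligned 4-byte copy bypasses the cache */
                        if ((dest & 3) || (bytes != 4))
                                wb_cache_pmem(addr, 1);
                } else {
                        if (dest & 7) {
                                /* misaligned head bytes went through the cache */
                                dest = ALIGN(dest, CLFLUSH_SIZE);
                                wb_cache_pmem(addr, 1);
                        }
                        flushed = dest - addr;
                        if ((bytes > flushed) && ((bytes - flushed) & 7))
                                /* trailing tail bytes went through the cache */
                                wb_cache_pmem(addr + bytes - 1, 1);
                }
        }

        int main(void)
        {
                flush_unaligned(0x1000, 256);   /* fully aligned: no flush */
                flush_unaligned(0x1003, 256);   /* misaligned head and tail: two flushes */
                flush_unaligned(0x1000, 4);     /* 4-byte aligned 4-byte copy: no flush */
                flush_unaligned(0x1001, 3);     /* small misaligned copy: flush */
                return 0;
        }

Running it makes the rule concrete: only transfers whose destination and
remaining length are 8-byte aligned (or an exact 4-byte-aligned 4-byte write)
avoid an explicit write-back of the head or tail cacheline.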