Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752500AbdDJVpy (ORCPT ); Mon, 10 Apr 2017 17:45:54 -0400 Received: from g2t2354.austin.hpe.com ([15.233.44.27]:44908 "EHLO g2t2354.austin.hpe.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751140AbdDJVpw (ORCPT ); Mon, 10 Apr 2017 17:45:52 -0400 X-Greylist: delayed 1006 seconds by postgrey-1.27 at vger.kernel.org; Mon, 10 Apr 2017 17:45:52 EDT From: "Kani, Toshimitsu" To: Dan Williams CC: "linux-nvdimm@lists.01.org" , Jan Kara , Matthew Wilcox , "x86@kernel.org" , "linux-kernel@vger.kernel.org" , "stable@vger.kernel.org" , Christoph Hellwig , Jeff Moyer , Ingo Molnar , Al Viro , "H. Peter Anvin" , Thomas Gleixner , Ross Zwisler Subject: RE: [PATCH v2] x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions Thread-Topic: [PATCH v2] x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions Thread-Index: AQHSr/2ySJko6sEzVEqxW9ti8glLH6G+8Z5AgAAtH4CAAAMYAIAAAvcAgAABTfA= Date: Mon, 10 Apr 2017 21:45:47 +0000 Message-ID: References: <149161025237.38725.13508986873214668503.stgit@dwillia2-desk3.amr.corp.intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: intel.com; dkim=none (message not signed) header.d=none;intel.com; dmarc=none action=none header.from=hpe.com; x-originating-ip: [174.51.38.138] x-microsoft-exchange-diagnostics: 1;CS1PR84MB0296;7:dtjRrhie/ueTHBPDk/b/hh6iQSVOXgcX4OebKeqlVQzd8oicSJs1otEMVGDZRkGRTethSn9DsxFNZjdoz4RntHOgHyU+Ds6qDlGPpNRyUZyC1WQtLF7Ibl5TwqIOWd6J8wlkskugOtW6xJrb0WFs2EH/LYQy5Nu0vlwfSEac1NLUKo0Hw3+B565kHgHPRsU8/QWJ9nwzpaBiCt5XulZ572HyC8duMmpWMF8wR0MmFtDp8dFLu2EL38GkYecB0agg4rcB689ePBGaT+x+AdOERlG1JEM7nv7GAoeRIug8Fta5w7+05CrfD+9U9XOeXhy1FM/Tk5TcYrWAjMUX+GWk7w== x-ms-office365-filtering-correlation-id: 77e74dd7-0fbb-4a16-0a5c-08d4805af2ee x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(22001)(2017030254075)(48565401081)(201703131423075)(201703031133081);SRVR:CS1PR84MB0296; x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(227479698468861); x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(6040450)(601004)(2401047)(8121501046)(5005006)(93006095)(93001095)(10201501046)(3002001)(6055026)(6041248)(201703131423075)(201702281528075)(201703061421075)(20161123555025)(20161123562025)(20161123560025)(20161123564025)(6072148);SRVR:CS1PR84MB0296;BCL:0;PCL:0;RULEID:;SRVR:CS1PR84MB0296; x-forefront-prvs: 027367F73D x-forefront-antispam-report: SFV:NSPM;SFS:(10019020)(6009001)(39450400003)(39400400002)(39840400002)(39410400002)(39860400002)(39850400002)(51914003)(24454002)(377454003)(110136004)(122556002)(2900100001)(93886004)(3846002)(6116002)(102836003)(38730400002)(33656002)(305945005)(6436002)(9686003)(74316002)(77096006)(66066001)(6506006)(7736002)(50986999)(55016002)(53936002)(8666007)(54906002)(7696004)(2950100002)(54356999)(76176999)(6916009)(229853002)(8936002)(86362001)(81166006)(8676002)(3660700001)(3280700002)(4326008)(2906002)(53546009)(7416002)(189998001)(5660300001)(25786009);DIR:OUT;SFP:1102;SCL:1;SRVR:CS1PR84MB0296;H:CS1PR84MB0294.NAMPRD84.PROD.OUTLOOK.COM;FPR:;SPF:None;MLV:sfv;LANG:en; spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 X-MS-Exchange-CrossTenant-originalarrivaltime: 10 Apr 2017 21:45:47.5419 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 105b2061-b669-4b31-92ac-24d304d195dc X-MS-Exchange-Transport-CrossTenantHeadersStamped: CS1PR84MB0296 X-OriginatorOrg: hpe.com Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id v3ALk4Cd015941 Content-Length: 1946 Lines: 57 > On Mon, Apr 10, 2017 at 2:28 PM, Kani, Toshimitsu > wrote: > >> > Thanks for the update. I think the alignment check should be based on > >> > the following note in copy_user_nocache. > >> > > >> > * Note: Cached memory copy is used when destination or size is not > >> > * naturally aligned. That is: > >> > * - Require 8-byte alignment when size is 8 bytes or larger. > >> > * - Require 4-byte alignment when size is 4 bytes. > >> > > >> > So, I think the code may be something like this. I also made the following > >> changes: > >> > >> Thanks! > >> > >> > - Mask with 7, not 8. > >> > >> Yes, good catch. > >> > >> > - ALIGN with cacheline size, instead of 8. > >> > - Add (bytes > flushed) test since calculation with unsigned long still > results > >> in a negative > >> > value (as a positive value). > >> > > >> > if (bytes < 8) { > >> > if ((dest & 3) || (bytes != 4)) > >> > arch_wb_cache_pmem(addr, 1); > >> > } else { > >> > if (dest & 7) { > >> > dest = ALIGN(dest, boot_cpu_data.x86_clflush_size); > >> > >> Why align the destination to the next cacheline? As far as I can see > >> the ALIGN_DESTINATION macro in arch/x86/include/asm/asm.h only aligns > >> to the next 8-byte boundary. > > > > The clflush here flushes for the cacheline size. So, we do not need to flush > > the same cacheline again when the unaligned tail is in the same line. > > Ok, makes sense. Last question, can't we reduce the check to be: > > if ((bytes > flushed) && ((bytes - flushed) & 3)) > > ...since if 'bytes' was 4-byte aligned we would have performed > non-temporal stores. That is not documented behavior of copy_user_nocache, but as long as the pmem version of copy_user_nocache follows the same implemented behavior, yes, that works. > Can I add your Signed-off-by: on v3? Sure. Thanks, -Toshi