Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752298AbdFSV5p (ORCPT ); Mon, 19 Jun 2017 17:57:45 -0400 Received: from mail-wm0-f66.google.com ([74.125.82.66]:34216 "EHLO mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751835AbdFSV5l (ORCPT ); Mon, 19 Jun 2017 17:57:41 -0400 Date: Tue, 20 Jun 2017 00:57:37 +0300 From: "Kirill A. Shutemov" To: Nadav Amit Cc: "Kirill A. Shutemov" , Andrew Morton , Vlastimil Babka , Vineet Gupta , Russell King , Will Deacon , Catalin Marinas , Ralf Baechle , "David S. Miller" , "Aneesh Kumar K . V" , Martin Schwidefsky , Heiko Carstens , Andrea Arcangeli , linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Ingo Molnar , "H . Peter Anvin" , Thomas Gleixner Subject: Re: [PATCHv2 1/3] x86/mm: Provide pmdp_establish() helper Message-ID: <20170619215737.hmjb23oafasig6rf@node.shutemov.name> References: <20170615145224.66200-1-kirill.shutemov@linux.intel.com> <20170615145224.66200-2-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: NeoMutt/20170609 (1.8.3) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1487 Lines: 39 On Mon, Jun 19, 2017 at 10:11:35AM -0700, Nadav Amit wrote: > Kirill A. Shutemov wrote: > > > We need an atomic way to setup pmd page table entry, avoiding races with > > CPU setting dirty/accessed bits. This is required to implement > > pmdp_invalidate() that doesn't loose these bits. > > > > On PAE we have to use cmpxchg8b as we cannot assume what is value of new pmd and > > setting it up half-by-half can expose broken corrupted entry to CPU. > > ... > > > > > +#ifndef pmdp_establish > > +#define pmdp_establish pmdp_establish > > +static inline pmd_t pmdp_establish(pmd_t *pmdp, pmd_t pmd) > > +{ > > + if (IS_ENABLED(CONFIG_SMP)) { > > + return xchg(pmdp, pmd); > > + } else { > > + pmd_t old = *pmdp; > > + *pmdp = pmd; > > I think you may want to use WRITE_ONCE() here - otherwise nobody guarantees > that the compiler will not split writes to *pmdp. Although the kernel uses > similar code to setting PTEs and PMDs, I think that it is best to start > fixing it. Obviously, you might need a different code path for 32-bit > kernels. This code is for 2-level pageing on 32-bit machines and for 4-level paging on 64-bit machine. In both cases sizeof(pmd_t) == sizeof(unsigned long). Sane compiler can't screw up anything here -- store of long is one shot. Compiler still can issue duplicate of store, but there's no harm. It guaranteed to be stable once ptl is released and CPU can't the entry half-updated. -- Kirill A. Shutemov