Date: Tue, 15 Jun 2021 10:46:39 +0100
From: Will Deacon
To: Jason Gunthorpe
Cc: Hugh Dickins, "Kirill A. Shutemov", Andrew Morton,
	"Kirill A. Shutemov", Yang Shi, Wang Yugui, Matthew Wilcox,
	Alistair Popple, Ralph Campbell, Zi Yan, Peter Xu,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 03/11] mm: page_vma_mapped_walk(): use pmd_read_atomic()
Message-ID: <20210615094639.GC19878@willie-the-truck>
References: <589b358c-febc-c88e-d4c2-7834b37fa7bf@google.com>
	<594c1f0-d396-5346-1f36-606872cddb18@google.com>
	<20210610090617.e6qutzzj3jxcseyi@box.shutemov.name>
	<20210610121542.GQ1096940@ziepe.ca>
	<20210611153613.GR1096940@ziepe.ca>
	<939a0fa-7d6c-f535-7c34-4c522903e6f@google.com>
	<20210611194249.GS1096940@ziepe.ca>
In-Reply-To: <20210611194249.GS1096940@ziepe.ca>
User-Agent: Mutt/1.10.1 (2018-07-13)

On Fri, Jun 11, 2021 at 04:42:49PM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 11, 2021 at 12:05:42PM -0700, Hugh Dickins wrote:
> > > diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
> > > index e896ebef8c24cb..0bf1fdec928e71 100644
> > > +++ b/arch/x86/include/asm/pgtable-3level.h
> > > @@ -75,7 +75,7 @@ static inline void native_set_pte(pte_t *ptep, pte_t pte)
> > >  static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
> > >  {
> > >  	pmdval_t ret;
> > > -	u32 *tmp = (u32 *)pmdp;
> > > +	u32 *tmp = READ_ONCE((u32 *)pmdp);
> > >
> > >  	ret = (pmdval_t) (*tmp);
> > >  	if (ret) {
> > > @@ -84,7 +84,7 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
> > >  		 * or we can end up with a partial pmd.
> > >  		 */
> > >  		smp_rmb();
> > > -		ret |= ((pmdval_t)*(tmp + 1)) << 32;
> > > +		ret |= READ_ONCE((pmdval_t)*(tmp + 1)) << 32;
> > >  	}
> >
> > Maybe that. Or maybe now (since Will's changes) it can just do
> > one READ_ONCE() of the whole, then adjust its local copy.
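For reference, the two approaches being discussed, the PAE-style split read with a barrier and Hugh's single full-width read, can be sketched in plain userspace C. The names, macros, and little-endian layout below are assumptions for illustration only, not the kernel's actual code:

```c
#include <stdint.h>

/* Userspace sketch only: barrier() is a plain compiler barrier standing
 * in for smp_rmb(), and the READ_ONCE_* macros are volatile-cast
 * stand-ins for the kernel's READ_ONCE(). Little-endian assumed. */
#define barrier()        __asm__ __volatile__("" ::: "memory")
#define READ_ONCE_32(x)  (*(const volatile uint32_t *)&(x))
#define READ_ONCE_64(x)  (*(const volatile uint64_t *)&(x))

typedef uint64_t pmdval_t;

/* PAE-style split read: fetch the low half first and, only if the
 * entry is not none, fetch the high half, with the two 32-bit reads
 * ordered by a barrier so we never see a torn PTE-table pointer. */
static pmdval_t pmd_read_split(pmdval_t *pmdp)
{
	uint32_t *tmp = (uint32_t *)pmdp;
	pmdval_t ret = READ_ONCE_32(tmp[0]);

	if (ret) {
		barrier();	/* low half read before high half */
		ret |= (pmdval_t)READ_ONCE_32(tmp[1]) << 32;
	}
	return ret;
}

/* Hugh's alternative: one full-width READ_ONCE() of the whole pmd,
 * relying on the 64-bit access being single-copy atomic, with any
 * adjustment done afterwards on the local copy. */
static pmdval_t pmd_read_whole(pmdval_t *pmdp)
{
	return READ_ONCE_64(*pmdp);
}
```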
>
> I think the smp_rmb() is critical here to ensure a PTE table pointer
> is coherent, READ_ONCE is not a substitute, unless I am
> misunderstanding what Will's changes are???

Yes, I agree that the barrier is needed here for x86 PAE.

I would really have liked to enforce native-sized access in READ_ONCE(),
but unfortunately there is plenty of code out there which is resilient to
a 64-bit access being split into two separate 32-bit accesses and so I
wasn't able to go that far.

That being said, pmd_read_atomic() probably _should_ be using READ_ONCE()
because using it inconsistently can give rise to broken codegen, e.g. if
you do:

	pmdval_t x, y, z;

	x = *pmdp;		// Invalid
	y = READ_ONCE(*pmdp);	// Valid
	if (pmd_valid(y))
		z = *pmdp;	// Invalid again!

Then the compiler can allocate the same register for x and z, but will
issue an additional load for y. If a concurrent update takes place to the
pmd which transitions from Invalid -> Valid, then it will look as though
things went back in time, because z will be stale. We actually hit this
on arm64 in practice [1].

Will

[1] https://lore.kernel.org/lkml/20171003114244.430374928@linuxfoundation.org/
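The hazard above comes from mixing plain dereferences with READ_ONCE(). A userspace sketch of the consistent pattern, using a volatile-cast stand-in for the kernel's READ_ONCE() and a made-up pmd_valid() check, might look like this (illustrative only):

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's READ_ONCE(): the volatile cast
 * forces the compiler to emit a real load here and forbids it from
 * reusing a value cached in a register from an earlier plain access,
 * which is what prevents z appearing to go "back in time". */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

typedef uint64_t pmdval_t;

/* Hypothetical validity check, for illustration only: treat any
 * non-zero value as a valid entry. */
static int pmd_valid(pmdval_t v)
{
	return v != 0;
}

/* Consistent use: every access to *pmdp goes through READ_ONCE(), so
 * the compiler cannot fold the two loads into one stale register. */
static pmdval_t read_pmd_consistently(pmdval_t *pmdp)
{
	pmdval_t y = READ_ONCE(*pmdp);

	if (pmd_valid(y))
		return READ_ONCE(*pmdp);	/* fresh load, never stale */
	return y;
}
```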