Received: by 2002:a25:f815:0:0:0:0:0 with SMTP id u21csp3126052ybd; Fri, 28 Jun 2019 03:20:38 -0700 (PDT) X-Google-Smtp-Source: APXvYqx58JP9OJsMFjhxv7KJdJgRWEM/iin5r2wY4T+mhUAgAcgUdPiYxFH0HqQadLJEUUJY6KuI X-Received: by 2002:a17:90a:730b:: with SMTP id m11mr12050408pjk.89.1561717238181; Fri, 28 Jun 2019 03:20:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1561717238; cv=none; d=google.com; s=arc-20160816; b=Qz04k9/TlJc/J19vmz5SapI8vUQ4o1oX5dmOWeOiF9QyecABx4GhsPaZBdMABCI3Ub yIsFwYD/ASDnYyDDlQZZbXujL6rbeO1AwTy7JxRWGkyJ0l+oRoGfiGpqLIgQvNR5gXEG QE+4FjzIS5bU7r4arV6b7kOp/R7ZjDQCWyN13WmZYtJHf9zyJ6xwOYKI5gy3JqnKnmlg xswfuk2SGcX3IdEIfCtis2k13XUJdvy3AnKArxi8b+ETNLD/09t0+D3arajqvde1Liol muXzAce7y5fRhmpz/BraM1kvgdzUYqRY91AZkswpBE1HQDW8MMKGzrixZhH4yrl608NH osaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=QZ0ly4oUftyR5ySntFBwDP3USgH45/JIGMJo3GowvVQ=; b=y8CeirrIEu/h8J7P4rXc3DxRdNAPqDIP+ppXL/9nh3CDXI1YAR/tl4w7zTJDC7eYID R9tcRHXOigsETierGRefjQqHCwtB5m1UG/yehcanXcnn5nJ55PHfO2FZohEzzBUo0+xK ZLTpS7/nephYuuCM2dv/M7QTf0nJlWLbL4mMoLWPrrSz0iZS+GRvPF3kkpdEkwy+H4dS w0L3BvolprpkHMUu+tr5HnwYKKdD9LDc70iBSAenBRVW336uOL/7i4cn4u60a80lEL8J A4ilLuSLiER2iJjim5rR7YAoVzcBjVyerqrkTlDHvdxEtmgOthpWMu5c3zLTYkfEaFsP DdXw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x10si1850571pfm.71.2019.06.28.03.20.22; Fri, 28 Jun 2019 03:20:38 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726587AbfF1KUI (ORCPT + 99 others); Fri, 28 Jun 2019 06:20:08 -0400 Received: from foss.arm.com ([217.140.110.172]:44402 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726465AbfF1KUI (ORCPT ); Fri, 28 Jun 2019 06:20:08 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1ED6828; Fri, 28 Jun 2019 03:20:07 -0700 (PDT) Received: from arrakis.emea.arm.com (arrakis.cambridge.arm.com [10.1.196.78]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id F05873F718; Fri, 28 Jun 2019 03:20:05 -0700 (PDT) Date: Fri, 28 Jun 2019 11:20:03 +0100 From: Catalin Marinas To: Anshuman Khandual Cc: linux-mm@kvack.org, Will Deacon , Mark Rutland , Marc Zyngier , Suzuki Poulose , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: [RFC 1/2] arm64/mm: Change THP helpers to comply with generic MM semantics Message-ID: <20190628102003.GA56463@arrakis.emea.arm.com> References: <1561639696-16361-1-git-send-email-anshuman.khandual@arm.com> <1561639696-16361-2-git-send-email-anshuman.khandual@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1561639696-16361-2-git-send-email-anshuman.khandual@arm.com> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Anshuman, On Thu, Jun 27, 2019 at 06:18:15PM +0530, Anshuman Khandual wrote: > pmd_present() and pmd_trans_huge() are expected to behave in the following > manner during various phases of a given PMD. It is derived from a previous > detailed discussion on this topic [1] and present THP documentation [2]. > > pmd_present(pmd): > > - Returns true if pmd refers to system RAM with a valid pmd_page(pmd) > - Returns false if pmd does not refer to system RAM - Invalid pmd_page(pmd) > > pmd_trans_huge(pmd): > > - Returns true if pmd refers to system RAM and is a trans huge mapping > > ------------------------------------------------------------------------- > | PMD states | pmd_present | pmd_trans_huge | > ------------------------------------------------------------------------- > | Mapped | Yes | Yes | > ------------------------------------------------------------------------- > | Splitting | Yes | Yes | > ------------------------------------------------------------------------- > | Migration/Swap | No | No | > ------------------------------------------------------------------------- Before we actually start fixing this, I would strongly suggest that you add a boot selftest (see lib/Kconfig.debug for other similar cases) which checks the consistency of the page table macros w.r.t. the expected mm semantics. Once the mm maintainers agreed with the semantics, it will really help architecture maintainers in implementing them correctly. You wouldn't need actual page tables, just things like assertions on pmd_trans_huge(pmd_mkhuge(pmd)) == true. You could go further and have checks on pmdp_invalidate(&dummy_vma, dummy_addr, &dummy_pmd) with the dummy_* variables on the stack. > The problem: > > PMD is first invalidated with pmdp_invalidate() before it's splitting. This > invalidation clears PMD_SECT_VALID as below. > > PMD Split -> pmdp_invalidate() -> pmd_mknotpresent -> Clears PMD_SECT_VALID > > Once PMD_SECT_VALID gets cleared, it results in pmd_present() return false > on the PMD entry. I think that's an inconsistency in the expected semantics here. Do you mean that pmd_present(pmd_mknotpresent(pmd)) should be true? If not, do we need to implement our own pmdp_invalidate() or change the generic one to set a "special" bit instead of just a pmd_mknotpresent? > +static inline int pmd_present(pmd_t pmd) > +{ > + if (pte_present(pmd_pte(pmd))) > + return 1; > + > + return pte_special(pmd_pte(pmd)); > +} [...] > +static inline pmd_t pmd_mknotpresent(pmd_t pmd) > +{ > + pmd = pte_pmd(pte_mkspecial(pmd_pte(pmd))); > + return __pmd(pmd_val(pmd) & ~PMD_SECT_VALID); > +} I'm not sure I agree with the semantics here where pmd_mknotpresent() does not actually make pmd_present() == false. -- Catalin