Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934048AbcCIT41 (ORCPT ); Wed, 9 Mar 2016 14:56:27 -0500 Received: from e19.ny.us.ibm.com ([129.33.205.209]:43118 "EHLO e19.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933972AbcCIT4S (ORCPT ); Wed, 9 Mar 2016 14:56:18 -0500 X-IBM-Helo: d01dlp01.pok.ibm.com X-IBM-MailFrom: aneesh.kumar@linux.vnet.ibm.com X-IBM-RcptTo: linux-kernel@vger.kernel.org From: "Aneesh Kumar K.V" To: Anshuman Khandual , linux-mm@kvack.org, linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org Cc: hughd@google.com, kirill@shutemov.name, n-horiguchi@ah.jp.nec.com, akpm@linux-foundation.org, mgorman@techsingularity.net, mpe@ellerman.id.au Subject: Re: [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K In-Reply-To: <1457525450-4262-4-git-send-email-khandual@linux.vnet.ibm.com> References: <1457525450-4262-1-git-send-email-khandual@linux.vnet.ibm.com> <1457525450-4262-4-git-send-email-khandual@linux.vnet.ibm.com> User-Agent: Notmuch/0.20.2 (http://notmuchmail.org) Emacs/24.5.1 (x86_64-pc-linux-gnu) Date: Thu, 10 Mar 2016 01:25:53 +0530 Message-ID: <878u1r1l4m.fsf@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16030919-0057-0000-0000-000003A95293 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4221 Lines: 137 Anshuman Khandual writes: > [ text/plain ] > From: root > > Currently the 'huge_pte_alloc' function has two versions, one for the > BOOK3S and the other one for the BOOK3E platforms. This change splits > the BOOK3S version into two parts, one for the 4K page size based > implementation and the other one for the 64K page sized implementation. > This change is one of the prerequisites towards enabling GENERAL_HUGETLB > implementation for BOOK3S 64K based huge pages. I really wish we reduce #ifdefs in C code and start splitting hash and nonhash code out where ever we can. What we really want here is a book3s version and in book3s version use powerpc specific huge_pte_alloc only if GENERAL_HUGETLB was not defined. Don't limit it to 64k linux page size. We should select between powerpc specific implementation and generic code using GENERAL_HUGETLB define. > > Signed-off-by: Anshuman Khandual > --- > arch/powerpc/mm/hugetlbpage.c | 67 +++++++++++++++++++++++++++---------------- > 1 file changed, 43 insertions(+), 24 deletions(-) > > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c > index 744e24b..a49c6ae 100644 > --- a/arch/powerpc/mm/hugetlbpage.c > +++ b/arch/powerpc/mm/hugetlbpage.c > @@ -59,6 +59,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) > return __find_linux_pte_or_hugepte(mm->pgd, addr, NULL, NULL); > } > > +#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) > static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, > unsigned long address, unsigned pdshift, unsigned pshift) > { > @@ -117,6 +118,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, > spin_unlock(&mm->page_table_lock); > return 0; > } > +#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */ > > /* > * These macros define how to determine which level of the page table holds > @@ -131,6 +133,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, > #endif > > #ifdef CONFIG_PPC_BOOK3S_64 > +#ifdef CONFIG_PPC_4K_PAGES > /* > * At this point we do the placement change only for BOOK3S 64. This would > * possibly work on other subarchs. > @@ -146,32 +149,23 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz > > addr &= ~(sz-1); > pg = pgd_offset(mm, addr); > - > - if (pshift == PGDIR_SHIFT) > - /* 16GB huge page */ > - return (pte_t *) pg; > - else if (pshift > PUD_SHIFT) > - /* > - * We need to use hugepd table > - */ > + if (pshift > PUD_SHIFT) { > hpdp = (hugepd_t *)pg; > - else { > - pdshift = PUD_SHIFT; > - pu = pud_alloc(mm, pg, addr); > - if (pshift == PUD_SHIFT) > - return (pte_t *)pu; > - else if (pshift > PMD_SHIFT) > - hpdp = (hugepd_t *)pu; > - else { > - pdshift = PMD_SHIFT; > - pm = pmd_alloc(mm, pu, addr); > - if (pshift == PMD_SHIFT) > - /* 16MB hugepage */ > - return (pte_t *)pm; > - else > - hpdp = (hugepd_t *)pm; > - } > + goto hugepd_search; > + } > + > + pdshift = PUD_SHIFT; > + pu = pud_alloc(mm, pg, addr); > + if (pshift > PMD_SHIFT) { > + hpdp = (hugepd_t *)pu; > + goto hugepd_search; > } > + > + pdshift = PMD_SHIFT; > + pm = pmd_alloc(mm, pu, addr); > + hpdp = (hugepd_t *)pm; > + > +hugepd_search: > if (!hpdp) > return NULL; > > @@ -184,6 +178,31 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz > } > > #else > +pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz) > +{ > + pgd_t *pg; > + pud_t *pu; > + pmd_t *pm; > + unsigned pshift = __ffs(sz); > + > + addr &= ~(sz-1); > + pg = pgd_offset(mm, addr); > + > + if (pshift == PGDIR_SHIFT) /* 16GB Huge Page */ > + return (pte_t *)pg; > + > + pu = pud_alloc(mm, pg, addr); /* NA, skipped */ > + if (pshift == PUD_SHIFT) > + return (pte_t *)pu; > + > + pm = pmd_alloc(mm, pu, addr); /* 16MB Huge Page */ > + if (pshift == PMD_SHIFT) > + return (pte_t *)pm; > + > + return NULL; > +} > +#endif > +#else > > pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz) > { > -- > 2.1.0