Subject: Re: [PATCH v3 4/5] powerpc/mm: Allow up to 64 low
 slices
To: "Aneesh Kumar K.V", Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
References: <6920f6efe2dcdabf59350b2d31ee6bd4bdef57f4.1516783089.git.christophe.leroy@c-s.fr> <5dfafb3f0e2438e43f44917ffcf70e3daa4f37ee.1516783089.git.christophe.leroy@c-s.fr> <87po5t18ll.fsf@linux.vnet.ibm.com>
From: Christophe LEROY
Message-ID: <2e59ae26-ba19-5416-b8b3-d51517c280b4@c-s.fr>
Date: Sat, 10 Feb 2018 13:58:04 +0100
In-Reply-To: <87po5t18ll.fsf@linux.vnet.ibm.com>

On 29/01/2018 at 07:29, Aneesh Kumar K.V wrote:
> Christophe Leroy writes:
>
>> While the implementation of the "slices" address space allows
>> a significant amount of high slices, it limits the number of
>> low slices to 16 due to the use of a single u64 low_slices_psize
>> element in struct mm_context_t
>>
>> On the 8xx, the minimum slice size is the size of the area
>> covered by a single PMD entry, i.e. 4M in 4K pages mode and 64M in
>> 16K pages mode. This means we could have at least 64 slices.
>>
>> In order to override this limitation, this patch switches the
>> handling of low_slices_psize to char array as done already for
>> high_slices_psize. This allows increasing the number of low
>> slices to 64 on the 8xx.
>>
>
> Maybe update the subject to "make low slice also a bitmap". Also indicate
> that the bitmap functions optimize the operation if the bitmap size is
> <= long?

As explained below (above the ---), this new version doesn't do that
anymore, as 64 slices is enough.

>
> Also switch the 8xx to a higher value in another patch?

Ok

Christophe

>
>> Signed-off-by: Christophe Leroy
>> ---
>>  v2: Using slice_bitmap_xxx() macros instead of bitmap_xxx() functions.
>>  v3: keep low_slices as a u64, this allows 64 slices which is enough.
>>
>>  arch/powerpc/include/asm/book3s/64/mmu.h |  3 +-
>>  arch/powerpc/include/asm/mmu-8xx.h       |  7 +++-
>>  arch/powerpc/include/asm/paca.h          |  2 +-
>>  arch/powerpc/include/asm/slice.h         |  1 -
>>  arch/powerpc/include/asm/slice_32.h      |  2 ++
>>  arch/powerpc/include/asm/slice_64.h      |  2 ++
>>  arch/powerpc/kernel/paca.c               |  3 +-
>>  arch/powerpc/mm/hash_utils_64.c          | 13 ++++----
>>  arch/powerpc/mm/slb_low.S                |  8 +++--
>>  arch/powerpc/mm/slice.c                  | 57 +++++++++++++++++---------------
>>  10 files changed, 56 insertions(+), 42 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
>> index c9448e19847a..b076a2d74c69 100644
>> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
>> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
>> @@ -91,7 +91,8 @@ typedef struct {
>>          struct npu_context *npu_context;
>>
>>  #ifdef CONFIG_PPC_MM_SLICES
>> -        u64 low_slices_psize;   /* SLB page size encodings */
>> +         /* SLB page size encodings*/
>> +        unsigned char low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
>>          unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
>>          unsigned long slb_addr_limit;
>>  #else
>> diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h
>> index 5f89b6010453..5f37ba06b56c 100644
>> --- a/arch/powerpc/include/asm/mmu-8xx.h
>> +++ b/arch/powerpc/include/asm/mmu-8xx.h
>> @@ -164,6 +164,11 @@
>>   */
>>  #define SPRN_M_TW       799
>>
>> +#ifdef CONFIG_PPC_MM_SLICES
>> +#include
>> +#define SLICE_ARRAY_SIZE        (1 << (32 - SLICE_LOW_SHIFT - 1))
>> +#endif
>> +
>>  #ifndef __ASSEMBLY__
>>  typedef struct {
>>          unsigned int id;
>> @@ -171,7 +176,7 @@ typedef struct {
>>          unsigned long vdso_base;
>>  #ifdef CONFIG_PPC_MM_SLICES
>>          u16 user_psize;         /* page size index */
>> -        u64 low_slices_psize;   /* page size encodings */
>> +        unsigned char low_slices_psize[SLICE_ARRAY_SIZE];
>>          unsigned char high_slices_psize[0];
>>          unsigned long slb_addr_limit;
>>  #endif
>> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
>> index 23ac7fc0af23..a3e531fe9ac7 100644
>> --- a/arch/powerpc/include/asm/paca.h
>> +++ b/arch/powerpc/include/asm/paca.h
>> @@ -141,7 +141,7 @@ struct paca_struct {
>>  #ifdef CONFIG_PPC_BOOK3S
>>          mm_context_id_t mm_ctx_id;
>>  #ifdef CONFIG_PPC_MM_SLICES
>> -        u64 mm_ctx_low_slices_psize;
>> +        unsigned char mm_ctx_low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
>>          unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE];
>>          unsigned long mm_ctx_slb_addr_limit;
>>  #else
>> diff --git a/arch/powerpc/include/asm/slice.h b/arch/powerpc/include/asm/slice.h
>> index 2b4b70de7e71..b67ba8faa507 100644
>> --- a/arch/powerpc/include/asm/slice.h
>> +++ b/arch/powerpc/include/asm/slice.h
>> @@ -16,7 +16,6 @@
>>  #define HAVE_ARCH_UNMAPPED_AREA
>>  #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
>>
>> -#define SLICE_LOW_SHIFT         28
>>  #define SLICE_LOW_TOP           (0x100000000ull)
>>  #define SLICE_NUM_LOW           (SLICE_LOW_TOP >> SLICE_LOW_SHIFT)
>>  #define GET_LOW_SLICE_INDEX(addr)       ((addr) >> SLICE_LOW_SHIFT)
>> diff --git a/arch/powerpc/include/asm/slice_32.h b/arch/powerpc/include/asm/slice_32.h
>> index 7e27c0dfb913..349187c20100 100644
>> --- a/arch/powerpc/include/asm/slice_32.h
>> +++ b/arch/powerpc/include/asm/slice_32.h
>> @@ -2,6 +2,8 @@
>>  #ifndef _ASM_POWERPC_SLICE_32_H
>>  #define _ASM_POWERPC_SLICE_32_H
>>
>> +#define SLICE_LOW_SHIFT         26      /* 64 slices */
>> +
>>  #define SLICE_HIGH_SHIFT        0
>>  #define SLICE_NUM_HIGH          0ul
>>  #define GET_HIGH_SLICE_INDEX(addr)      (addr & 0)
>> diff --git a/arch/powerpc/include/asm/slice_64.h b/arch/powerpc/include/asm/slice_64.h
>> index 9d1c97b83010..0959475239c6 100644
>> --- a/arch/powerpc/include/asm/slice_64.h
>> +++ b/arch/powerpc/include/asm/slice_64.h
>> @@ -2,6 +2,8 @@
>>  #ifndef _ASM_POWERPC_SLICE_64_H
>>  #define _ASM_POWERPC_SLICE_64_H
>>
>> +#define SLICE_LOW_SHIFT         28
>> +
>
> You are moving the LOW_SHIFT here, maybe you can fix that up in the earlier
> patch as reviewed there.

Yes, done in the earlier patch now.

>
>>  #define SLICE_HIGH_SHIFT        40
>>  #define SLICE_NUM_HIGH          (H_PGTABLE_RANGE >> SLICE_HIGH_SHIFT)
>>  #define GET_HIGH_SLICE_INDEX(addr)      ((addr) >> SLICE_HIGH_SHIFT)
>> diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
>> index d6597038931d..8e1566bf82b8 100644
>> --- a/arch/powerpc/kernel/paca.c
>> +++ b/arch/powerpc/kernel/paca.c
>> @@ -264,7 +264,8 @@ void copy_mm_to_paca(struct mm_struct *mm)
>>  #ifdef CONFIG_PPC_MM_SLICES
>>          VM_BUG_ON(!mm->context.slb_addr_limit);
>>          get_paca()->mm_ctx_slb_addr_limit = mm->context.slb_addr_limit;
>> -        get_paca()->mm_ctx_low_slices_psize = context->low_slices_psize;
>> +        memcpy(&get_paca()->mm_ctx_low_slices_psize,
>> +               &context->low_slices_psize, sizeof(context->low_slices_psize));
>>          memcpy(&get_paca()->mm_ctx_high_slices_psize,
>>                 &context->high_slices_psize, TASK_SLICE_ARRAY_SZ(mm));
>>  #else /* CONFIG_PPC_MM_SLICES */
>> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
>> index 655a5a9a183d..da696565b969 100644
>> --- a/arch/powerpc/mm/hash_utils_64.c
>> +++ b/arch/powerpc/mm/hash_utils_64.c
>> @@ -1097,19 +1097,18 @@ unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
>>  #ifdef CONFIG_PPC_MM_SLICES
>>  static unsigned int get_paca_psize(unsigned long addr)
>>  {
>> -        u64 lpsizes;
>> -        unsigned char *hpsizes;
>> +        unsigned char *psizes;
>>          unsigned long index, mask_index;
>>
>>          if (addr < SLICE_LOW_TOP) {
>> -                lpsizes = get_paca()->mm_ctx_low_slices_psize;
>> +                psizes = get_paca()->mm_ctx_low_slices_psize;
>>                  index = GET_LOW_SLICE_INDEX(addr);
>> -                return (lpsizes >> (index * 4)) & 0xF;
>> +        } else {
>> +                psizes = get_paca()->mm_ctx_high_slices_psize;
>> +                index = GET_HIGH_SLICE_INDEX(addr);
>>          }
>> -        hpsizes = get_paca()->mm_ctx_high_slices_psize;
>> -        index = GET_HIGH_SLICE_INDEX(addr);
>>          mask_index = index & 0x1;
>> -        return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
>> +        return (psizes[index >> 1] >> (mask_index * 4)) & 0xF;
>>  }
>>
>>  #else
>> diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
>> index 2cf5ef3fc50d..2c7c717fd2ea 100644
>> --- a/arch/powerpc/mm/slb_low.S
>> +++ b/arch/powerpc/mm/slb_low.S
>> @@ -200,10 +200,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
>>  5:
>>          /*
>>           * Handle lpsizes
>> -         * r9 is get_paca()->context.low_slices_psize, r11 is index
>> +         * r9 is get_paca()->context.low_slices_psize[index], r11 is mask_index
>>           */
>> -        ld      r9,PACALOWSLICESPSIZE(r13)
>> -        mr      r11,r10
>> +        srdi    r11,r10,1 /* index */
>> +        addi    r9,r11,PACALOWSLICESPSIZE
>> +        lbzx    r9,r13,r9               /* r9 is lpsizes[r11] */
>> +        rldicl  r11,r10,0,63            /* r11 = r10 & 0x1 */
>>  6:
>>          sldi    r11,r11,2  /* index * 4 */
>>          /* Extract the psize and multiply to get an array offset */
>> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
>> index 549704dfa777..3d573a038d42 100644
>> --- a/arch/powerpc/mm/slice.c
>> +++ b/arch/powerpc/mm/slice.c
>> @@ -148,18 +148,20 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret,
>>  static void slice_mask_for_size(struct mm_struct *mm, int psize, struct slice_mask *ret,
>>                                  unsigned long high_limit)
>>  {
>> -        unsigned char *hpsizes;
>> +        unsigned char *hpsizes, *lpsizes;
>>          int index, mask_index;
>>          unsigned long i;
>> -        u64 lpsizes;
>>
>>          ret->low_slices = 0;
>>          slice_bitmap_zero(ret->high_slices, SLICE_NUM_HIGH);
>>
>>          lpsizes = mm->context.low_slices_psize;
>> -        for (i = 0; i < SLICE_NUM_LOW; i++)
>> -                if (((lpsizes >> (i * 4)) & 0xf) == psize)
>> +        for (i = 0; i < SLICE_NUM_LOW; i++) {
>> +                mask_index = i & 0x1;
>> +                index = i >> 1;
>> +                if (((lpsizes[index] >> (mask_index * 4)) & 0xf) == psize)
>>                          ret->low_slices |= 1u << i;
>> +        }
>>
>>          if (high_limit <= SLICE_LOW_TOP)
>>                  return;
>> @@ -211,8 +213,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
>>  {
>>          int index, mask_index;
>>          /* Write the new slice psize bits */
>> -        unsigned char *hpsizes;
>> -        u64 lpsizes;
>> +        unsigned char *hpsizes, *lpsizes;
>>          unsigned long i, flags;
>>
>>          slice_dbg("slice_convert(mm=%p, psize=%d)\n", mm, psize);
>> @@ -225,12 +226,13 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
>>
>>          lpsizes = mm->context.low_slices_psize;
>>          for (i = 0; i < SLICE_NUM_LOW; i++)
>> -                if (mask.low_slices & (1u << i))
>> -                        lpsizes = (lpsizes & ~(0xful << (i * 4))) |
>> -                                (((unsigned long)psize) << (i * 4));
>> -
>> -        /* Assign the value back */
>> -        mm->context.low_slices_psize = lpsizes;
>> +                if (mask.low_slices & (1u << i)) {
>> +                        mask_index = i & 0x1;
>> +                        index = i >> 1;
>> +                        lpsizes[index] = (lpsizes[index] &
>> +                                          ~(0xf << (mask_index * 4))) |
>> +                                (((unsigned long)psize) << (mask_index * 4));
>> +                }
>>
>>          hpsizes = mm->context.high_slices_psize;
>>          for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++) {
>> @@ -629,7 +631,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp,
>>
>>  unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
>>  {
>> -        unsigned char *hpsizes;
>> +        unsigned char *psizes;
>>          int index, mask_index;
>>
>>          /*
>> @@ -643,15 +645,14 @@ unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
>>  #endif
>>          }
>>          if (addr < SLICE_LOW_TOP) {
>> -                u64 lpsizes;
>> -                lpsizes = mm->context.low_slices_psize;
>> +                psizes = mm->context.low_slices_psize;
>>                  index = GET_LOW_SLICE_INDEX(addr);
>> -                return (lpsizes >> (index * 4)) & 0xf;
>> +        } else {
>> +                psizes = mm->context.high_slices_psize;
>> +                index = GET_HIGH_SLICE_INDEX(addr);
>>          }
>> -        hpsizes = mm->context.high_slices_psize;
>> -        index = GET_HIGH_SLICE_INDEX(addr);
>>          mask_index = index & 0x1;
>> -        return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xf;
>> +        return (psizes[index >> 1] >> (mask_index * 4)) & 0xf;
>>  }
>>  EXPORT_SYMBOL_GPL(get_slice_psize);
>>
>> @@ -672,8 +673,8 @@ EXPORT_SYMBOL_GPL(get_slice_psize);
>>  void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
>>  {
>>          int index, mask_index;
>> -        unsigned char *hpsizes;
>> -        unsigned long flags, lpsizes;
>> +        unsigned char *hpsizes, *lpsizes;
>> +        unsigned long flags;
>>          unsigned int old_psize;
>>          int i;
>>
>> @@ -691,12 +692,14 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
>>          wmb();
>>
>>          lpsizes = mm->context.low_slices_psize;
>> -        for (i = 0; i < SLICE_NUM_LOW; i++)
>> -                if (((lpsizes >> (i * 4)) & 0xf) == old_psize)
>> -                        lpsizes = (lpsizes & ~(0xful << (i * 4))) |
>> -                                (((unsigned long)psize) << (i * 4));
>> -        /* Assign the value back */
>> -        mm->context.low_slices_psize = lpsizes;
>> +        for (i = 0; i < SLICE_NUM_LOW; i++) {
>> +                mask_index = i & 0x1;
>> +                index = i >> 1;
>> +                if (((lpsizes[index] >> (mask_index * 4)) & 0xf) == old_psize)
>> +                        lpsizes[index] = (lpsizes[index] &
>> +                                          ~(0xf << (mask_index * 4))) |
>> +                                (((unsigned long)psize) << (mask_index * 4));
>> +        }
>>
>>          hpsizes = mm->context.high_slices_psize;
>>          for (i = 0; i < SLICE_NUM_HIGH; i++) {
>> --
>> 2.13.3
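
For readers following the thread, the nibble packing that the quoted patch applies to low_slices_psize (two 4-bit page size codes per byte, selected with index = i >> 1 and mask_index = i & 0x1) can be sketched in standalone C. This is only an illustration under stated assumptions, not the kernel code: NUM_LOW_SLICES, PSIZE_BYTES, psize_get(), psize_set() and slices_with_psize() are hypothetical names, and the slice mask is built with a 64-bit constant so that shifts up to bit 63 stay well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: 64 low slices, two 4-bit codes per byte. */
#define NUM_LOW_SLICES 64
#define PSIZE_BYTES    (NUM_LOW_SLICES / 2)

/* Read the 4-bit page size code of slice i from a nibble-packed array. */
static unsigned int psize_get(const unsigned char *psizes, unsigned int i)
{
	unsigned int index = i >> 1;       /* byte holding the slice */
	unsigned int mask_index = i & 0x1; /* 0 = low nibble, 1 = high nibble */

	assert(i < NUM_LOW_SLICES);
	return (psizes[index] >> (mask_index * 4)) & 0xf;
}

/* Overwrite the 4-bit code of slice i, leaving the neighbouring nibble intact. */
static void psize_set(unsigned char *psizes, unsigned int i, unsigned int psize)
{
	unsigned int index = i >> 1;
	unsigned int mask_index = i & 0x1;

	assert(i < NUM_LOW_SLICES);
	assert(psize <= 0xf);
	psizes[index] = (unsigned char)((psizes[index] & ~(0xfu << (mask_index * 4))) |
					(psize << (mask_index * 4)));
}

/* Build a 64-bit mask with one bit set per slice whose code equals psize. */
static uint64_t slices_with_psize(const unsigned char *psizes, unsigned int psize)
{
	uint64_t mask = 0;
	unsigned int i;

	for (i = 0; i < NUM_LOW_SLICES; i++)
		if (psize_get(psizes, i) == psize)
			mask |= (uint64_t)1 << i;
	return mask;
}
```

A caller would zero an array of PSIZE_BYTES bytes, store codes with psize_set(), and use slices_with_psize() to collect the matching slices into a single 64-bit mask, in the same spirit as slice_mask_for_size() in the quoted diff.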