Subject: Re: [PATCH v3 4/5] powerpc/mm:
 Allow up to 64 low slices
To: "Aneesh Kumar K.V", Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
References: <6920f6efe2dcdabf59350b2d31ee6bd4bdef57f4.1516783089.git.christophe.leroy@c-s.fr>
 <5dfafb3f0e2438e43f44917ffcf70e3daa4f37ee.1516783089.git.christophe.leroy@c-s.fr>
 <87po5t18ll.fsf@linux.vnet.ibm.com>
From: Christophe LEROY
Message-ID: <1a25b9e3-582d-207c-a2cd-fd33ee5e5df0@c-s.fr>
Date: Mon, 29 Jan 2018 09:56:43 +0100
In-Reply-To: <87po5t18ll.fsf@linux.vnet.ibm.com>

On 29/01/2018 at 07:29, Aneesh Kumar K.V wrote:
> Christophe Leroy writes:
>
>> While the implementation of the "slices" address space allows
>> a significant number of high slices, it limits the number of
>> low slices to 16 due to the use of a single u64 low_slices_psize
>> element in struct mm_context_t.
>>
>> On the 8xx, the minimum slice size is the size of the area
>> covered by a single PMD entry, ie 4M in 4K pages mode and 64M in
>> 16K pages mode. This means we could have at least 64 slices.
>>
>> In order to override this limitation, this patch switches the
>> handling of low_slices_psize to a char array, as already done for
>> high_slices_psize. This makes it possible to increase the number
>> of low slices to 64 on the 8xx.
>>
>
> Maybe update the subject to "make low slice also a bitmap". Also indicate
> that the bitmap functions optimize the operation if the bitmap size is
> <= long?

v3 doesn't use bitmap functions anymore for low_slices. In this version,
only low_slices_psize has been reworked to allow up to 64 slices instead
of 16.
I have kept low_slices as is (ie as a u64), hence allowing up to 64
slices, which is big enough.

>
> Also switch the 8xx to a higher value in another patch?

One separate patch just for changing the value of SLICE_LOW_SHIFT from
28 to 26 on the 8xx?

>
>> Signed-off-by: Christophe Leroy
>> ---
>>  v2: Using slice_bitmap_xxx() macros instead of bitmap_xxx() functions.
>>  v3: keep low_slices as a u64, this allows 64 slices which is enough.
>>
>>  arch/powerpc/include/asm/book3s/64/mmu.h |  3 +-
>>  arch/powerpc/include/asm/mmu-8xx.h       |  7 +++-
>>  arch/powerpc/include/asm/paca.h          |  2 +-
>>  arch/powerpc/include/asm/slice.h         |  1 -
>>  arch/powerpc/include/asm/slice_32.h      |  2 ++
>>  arch/powerpc/include/asm/slice_64.h      |  2 ++
>>  arch/powerpc/kernel/paca.c               |  3 +-
>>  arch/powerpc/mm/hash_utils_64.c          | 13 ++++----
>>  arch/powerpc/mm/slb_low.S                |  8 +++--
>>  arch/powerpc/mm/slice.c                  | 57 +++++++++++++++++---------------
>>  10 files changed, 56 insertions(+), 42 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
>> index c9448e19847a..b076a2d74c69 100644
>> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
>> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
>> @@ -91,7 +91,8 @@ typedef struct {
>>  	struct npu_context *npu_context;
>>
>>  #ifdef CONFIG_PPC_MM_SLICES
>> -	u64 low_slices_psize;	/* SLB page size encodings */
>> +	 /* SLB page size encodings*/
>> +	unsigned char low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
>>  	unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
>>  	unsigned long slb_addr_limit;
>>  #else
>> diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h
>> index 5f89b6010453..5f37ba06b56c 100644
>> --- a/arch/powerpc/include/asm/mmu-8xx.h
>> +++ b/arch/powerpc/include/asm/mmu-8xx.h
>> @@ -164,6 +164,11 @@
>>   */
>>  #define SPRN_M_TW	799
>>
>> +#ifdef CONFIG_PPC_MM_SLICES
>> +#include
>> +#define SLICE_ARRAY_SIZE	(1 << (32 - SLICE_LOW_SHIFT - 1))
>> +#endif
>> +
>>  #ifndef __ASSEMBLY__
>>  typedef struct {
>>  	unsigned int id;
>> @@ -171,7 +176,7 @@ typedef struct {
>>  	unsigned long vdso_base;
>>  #ifdef CONFIG_PPC_MM_SLICES
>>  	u16 user_psize;		/* page size index */
>> -	u64 low_slices_psize;	/* page size encodings */
>> +	unsigned char low_slices_psize[SLICE_ARRAY_SIZE];
>>  	unsigned char high_slices_psize[0];
>>  	unsigned long slb_addr_limit;
>>  #endif
>> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
>> index 23ac7fc0af23..a3e531fe9ac7 100644
>> --- a/arch/powerpc/include/asm/paca.h
>> +++ b/arch/powerpc/include/asm/paca.h
>> @@ -141,7 +141,7 @@ struct paca_struct {
>>  #ifdef CONFIG_PPC_BOOK3S
>>  	mm_context_id_t mm_ctx_id;
>>  #ifdef CONFIG_PPC_MM_SLICES
>> -	u64 mm_ctx_low_slices_psize;
>> +	unsigned char mm_ctx_low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
>>  	unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE];
>>  	unsigned long mm_ctx_slb_addr_limit;
>>  #else
>> diff --git a/arch/powerpc/include/asm/slice.h b/arch/powerpc/include/asm/slice.h
>> index 2b4b70de7e71..b67ba8faa507 100644
>> --- a/arch/powerpc/include/asm/slice.h
>> +++ b/arch/powerpc/include/asm/slice.h
>> @@ -16,7 +16,6 @@
>>  #define HAVE_ARCH_UNMAPPED_AREA
>>  #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
>>
>> -#define SLICE_LOW_SHIFT	28
>>  #define SLICE_LOW_TOP		(0x100000000ull)
>>  #define SLICE_NUM_LOW		(SLICE_LOW_TOP >> SLICE_LOW_SHIFT)
>>  #define GET_LOW_SLICE_INDEX(addr)	((addr) >> SLICE_LOW_SHIFT)
>> diff --git a/arch/powerpc/include/asm/slice_32.h b/arch/powerpc/include/asm/slice_32.h
>> index 7e27c0dfb913..349187c20100 100644
>> --- a/arch/powerpc/include/asm/slice_32.h
>> +++ b/arch/powerpc/include/asm/slice_32.h
>> @@ -2,6 +2,8 @@
>>  #ifndef _ASM_POWERPC_SLICE_32_H
>>  #define _ASM_POWERPC_SLICE_32_H
>>
>> +#define SLICE_LOW_SHIFT		26	/* 64 slices */
>> +
>>  #define SLICE_HIGH_SHIFT	0
>>  #define SLICE_NUM_HIGH		0ul
>>  #define GET_HIGH_SLICE_INDEX(addr)	(addr & 0)
>> diff --git a/arch/powerpc/include/asm/slice_64.h b/arch/powerpc/include/asm/slice_64.h
>> index 9d1c97b83010..0959475239c6 100644
>> --- a/arch/powerpc/include/asm/slice_64.h
>> +++ b/arch/powerpc/include/asm/slice_64.h
>> @@ -2,6 +2,8 @@
>>  #ifndef _ASM_POWERPC_SLICE_64_H
>>  #define _ASM_POWERPC_SLICE_64_H
>>
>> +#define SLICE_LOW_SHIFT		28
>> +
>
> You are moving the LOW_SHIFT here; maybe you can fix that up in the
> earlier patch, as reviewed there.

Ok

Christophe

>
>>  #define SLICE_HIGH_SHIFT	40
>>  #define SLICE_NUM_HIGH		(H_PGTABLE_RANGE >> SLICE_HIGH_SHIFT)
>>  #define GET_HIGH_SLICE_INDEX(addr)	((addr) >> SLICE_HIGH_SHIFT)
>> diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
>> index d6597038931d..8e1566bf82b8 100644
>> --- a/arch/powerpc/kernel/paca.c
>> +++ b/arch/powerpc/kernel/paca.c
>> @@ -264,7 +264,8 @@ void copy_mm_to_paca(struct mm_struct *mm)
>>  #ifdef CONFIG_PPC_MM_SLICES
>>  	VM_BUG_ON(!mm->context.slb_addr_limit);
>>  	get_paca()->mm_ctx_slb_addr_limit = mm->context.slb_addr_limit;
>> -	get_paca()->mm_ctx_low_slices_psize = context->low_slices_psize;
>> +	memcpy(&get_paca()->mm_ctx_low_slices_psize,
>> +	       &context->low_slices_psize, sizeof(context->low_slices_psize));
>>  	memcpy(&get_paca()->mm_ctx_high_slices_psize,
>>  	       &context->high_slices_psize, TASK_SLICE_ARRAY_SZ(mm));
>>  #else /* CONFIG_PPC_MM_SLICES */
>> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
>> index 655a5a9a183d..da696565b969 100644
>> --- a/arch/powerpc/mm/hash_utils_64.c
>> +++ b/arch/powerpc/mm/hash_utils_64.c
>> @@ -1097,19 +1097,18 @@ unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
>>  #ifdef CONFIG_PPC_MM_SLICES
>>  static unsigned int get_paca_psize(unsigned long addr)
>>  {
>> -	u64 lpsizes;
>> -	unsigned char *hpsizes;
>> +	unsigned char *psizes;
>>  	unsigned long index, mask_index;
>>
>>  	if (addr < SLICE_LOW_TOP) {
>> -		lpsizes = get_paca()->mm_ctx_low_slices_psize;
>> +		psizes = get_paca()->mm_ctx_low_slices_psize;
>>  		index = GET_LOW_SLICE_INDEX(addr);
>> -		return (lpsizes >> (index * 4)) & 0xF;
>> +	} else {
>> +		psizes = get_paca()->mm_ctx_high_slices_psize;
>> +		index = GET_HIGH_SLICE_INDEX(addr);
>>  	}
>> -	hpsizes = get_paca()->mm_ctx_high_slices_psize;
>> -	index = GET_HIGH_SLICE_INDEX(addr);
>>  	mask_index = index & 0x1;
>> -	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
>> +	return (psizes[index >> 1] >> (mask_index * 4)) & 0xF;
>>  }
>>
>>  #else
>> diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
>> index 2cf5ef3fc50d..2c7c717fd2ea 100644
>> --- a/arch/powerpc/mm/slb_low.S
>> +++ b/arch/powerpc/mm/slb_low.S
>> @@ -200,10 +200,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
>>  5:
>>  	/*
>>  	 * Handle lpsizes
>> -	 * r9 is get_paca()->context.low_slices_psize, r11 is index
>> +	 * r9 is get_paca()->context.low_slices_psize[index], r11 is mask_index
>>  	 */
>> -	ld	r9,PACALOWSLICESPSIZE(r13)
>> -	mr	r11,r10
>> +	srdi	r11,r10,1 /* index */
>> +	addi	r9,r11,PACALOWSLICESPSIZE
>> +	lbzx	r9,r13,r9		/* r9 is lpsizes[r11] */
>> +	rldicl	r11,r10,0,63		/* r11 = r10 & 0x1 */
>>  6:
>>  	sldi	r11,r11,2  /* index * 4 */
>>  	/* Extract the psize and multiply to get an array offset */
>> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
>> index 549704dfa777..3d573a038d42 100644
>> --- a/arch/powerpc/mm/slice.c
>> +++ b/arch/powerpc/mm/slice.c
>> @@ -148,18 +148,20 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret,
>>  static void slice_mask_for_size(struct mm_struct *mm, int psize, struct slice_mask *ret,
>>  				unsigned long high_limit)
>>  {
>> -	unsigned char *hpsizes;
>> +	unsigned char *hpsizes, *lpsizes;
>>  	int index, mask_index;
>>  	unsigned long i;
>> -	u64 lpsizes;
>>
>>  	ret->low_slices = 0;
>>  	slice_bitmap_zero(ret->high_slices, SLICE_NUM_HIGH);
>>
>>  	lpsizes = mm->context.low_slices_psize;
>> -	for (i = 0; i < SLICE_NUM_LOW; i++)
>> -		if (((lpsizes >> (i * 4)) & 0xf) == psize)
>> +	for (i = 0; i < SLICE_NUM_LOW; i++) {
>> +		mask_index = i & 0x1;
>> +		index = i >> 1;
>> +		if (((lpsizes[index] >> (mask_index * 4)) & 0xf) == psize)
>>  			ret->low_slices |= 1u << i;
>> +	}
>>
>>  	if (high_limit <= SLICE_LOW_TOP)
>>  		return;
>> @@ -211,8 +213,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
>>  {
>>  	int index, mask_index;
>>  	/* Write the new slice psize bits */
>> -	unsigned char *hpsizes;
>> -	u64 lpsizes;
>> +	unsigned char *hpsizes, *lpsizes;
>>  	unsigned long i, flags;
>>
>>  	slice_dbg("slice_convert(mm=%p, psize=%d)\n", mm, psize);
>> @@ -225,12 +226,13 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
>>
>>  	lpsizes = mm->context.low_slices_psize;
>>  	for (i = 0; i < SLICE_NUM_LOW; i++)
>> -		if (mask.low_slices & (1u << i))
>> -			lpsizes = (lpsizes & ~(0xful << (i * 4))) |
>> -				(((unsigned long)psize) << (i * 4));
>> -
>> -	/* Assign the value back */
>> -	mm->context.low_slices_psize = lpsizes;
>> +		if (mask.low_slices & (1u << i)) {
>> +			mask_index = i & 0x1;
>> +			index = i >> 1;
>> +			lpsizes[index] = (lpsizes[index] &
>> +					  ~(0xf << (mask_index * 4))) |
>> +				(((unsigned long)psize) << (mask_index * 4));
>> +		}
>>
>>  	hpsizes = mm->context.high_slices_psize;
>>  	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++) {
>> @@ -629,7 +631,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp,
>>
>>  unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
>>  {
>> -	unsigned char *hpsizes;
>> +	unsigned char *psizes;
>>  	int index, mask_index;
>>
>>  	/*
>> @@ -643,15 +645,14 @@ unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
>>  #endif
>>  	}
>>  	if (addr < SLICE_LOW_TOP) {
>> -		u64 lpsizes;
>> -		lpsizes = mm->context.low_slices_psize;
>> +		psizes = mm->context.low_slices_psize;
>>  		index = GET_LOW_SLICE_INDEX(addr);
>> -		return (lpsizes >> (index * 4)) & 0xf;
>> +	} else {
>> +		psizes = mm->context.high_slices_psize;
>> +		index = GET_HIGH_SLICE_INDEX(addr);
>>  	}
>> -	hpsizes = mm->context.high_slices_psize;
>> -	index = GET_HIGH_SLICE_INDEX(addr);
>>  	mask_index = index & 0x1;
>> -	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xf;
>> +	return (psizes[index >> 1] >> (mask_index * 4)) & 0xf;
>>  }
>>  EXPORT_SYMBOL_GPL(get_slice_psize);
>>
>> @@ -672,8 +673,8 @@ EXPORT_SYMBOL_GPL(get_slice_psize);
>>  void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
>>  {
>>  	int index, mask_index;
>> -	unsigned char *hpsizes;
>> -	unsigned long flags, lpsizes;
>> +	unsigned char *hpsizes, *lpsizes;
>> +	unsigned long flags;
>>  	unsigned int old_psize;
>>  	int i;
>>
>> @@ -691,12 +692,14 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
>>  	wmb();
>>
>>  	lpsizes = mm->context.low_slices_psize;
>> -	for (i = 0; i < SLICE_NUM_LOW; i++)
>> -		if (((lpsizes >> (i * 4)) & 0xf) == old_psize)
>> -			lpsizes = (lpsizes & ~(0xful << (i * 4))) |
>> -				(((unsigned long)psize) << (i * 4));
>> -	/* Assign the value back */
>> -	mm->context.low_slices_psize = lpsizes;
>> +	for (i = 0; i < SLICE_NUM_LOW; i++) {
>> +		mask_index = i & 0x1;
>> +		index = i >> 1;
>> +		if (((lpsizes[index] >> (mask_index * 4)) & 0xf) == old_psize)
>> +			lpsizes[index] = (lpsizes[index] &
>> +					  ~(0xf << (mask_index * 4))) |
>> +				(((unsigned long)psize) << (mask_index * 4));
>> +	}
>>
>>  	hpsizes = mm->context.high_slices_psize;
>>  	for (i = 0; i < SLICE_NUM_HIGH; i++) {
>> --
>> 2.13.3
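
For readers following the thread, the nibble-packed layout that the patch applies to low_slices_psize (two 4-bit page size encodings per byte, as already done for high_slices_psize) can be illustrated with a small user-space sketch. This is only an illustration under stated assumptions, not the kernel code: the names LOW_SHIFT, NUM_LOW_SLICES, psize_get(), psize_set() and psize_mask() are hypothetical, and the 64-slice configuration is taken from the 8xx value of SLICE_LOW_SHIFT (26) in the quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: 4G of low address space cut into 64M slices. */
#define LOW_SHIFT	26
#define NUM_LOW_SLICES	(1u << (32 - LOW_SHIFT))	/* 64 slices, 32 bytes */

/* Read the 4-bit encoding of slice i: byte i / 2, nibble i % 2. */
static unsigned int psize_get(const unsigned char *psizes, unsigned int i)
{
	unsigned int index = i >> 1;
	unsigned int mask_index = i & 0x1;

	return (psizes[index] >> (mask_index * 4)) & 0xf;
}

/* Overwrite the 4-bit encoding of slice i, leaving the other nibble intact. */
static void psize_set(unsigned char *psizes, unsigned int i, unsigned int psize)
{
	unsigned int index = i >> 1;
	unsigned int shift = (i & 0x1) * 4;

	assert(psize <= 0xf);	/* an encoding must fit in one nibble */
	psizes[index] = (unsigned char)((psizes[index] & ~(0xfu << shift)) |
					(psize << shift));
}

/*
 * Build a 64-bit mask with bit i set for every slice whose encoding is
 * psize. A 64-bit constant is shifted here so that shifts by 32..63
 * remain well defined.
 */
static uint64_t psize_mask(const unsigned char *psizes, unsigned int psize)
{
	uint64_t mask = 0;
	unsigned int i;

	for (i = 0; i < NUM_LOW_SLICES; i++)
		if (psize_get(psizes, i) == psize)
			mask |= (uint64_t)1 << i;

	return mask;
}
```

With this layout, 64 slices need a 32-byte array, and psize_set(p, 63, 5) only touches the high nibble of p[31]. The mask still fits in a single u64, which matches the choice in v3 to keep low_slices as a u64.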