From: "Aneesh Kumar K.V"
To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v3 4/5] powerpc/mm: Allow up to 64 low slices
In-Reply-To: <5dfafb3f0e2438e43f44917ffcf70e3daa4f37ee.1516783089.git.christophe.leroy@c-s.fr>
References: <6920f6efe2dcdabf59350b2d31ee6bd4bdef57f4.1516783089.git.christophe.leroy@c-s.fr> <5dfafb3f0e2438e43f44917ffcf70e3daa4f37ee.1516783089.git.christophe.leroy@c-s.fr>
Date: Mon, 29 Jan 2018 11:59:10 +0530
Message-Id: <87po5t18ll.fsf@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Christophe Leroy writes:

> While the implementation of the "slices" address space allows
> a significant amount of high slices, it limits the number of
> low slices to 16 due to the use of a single u64 low_slices_psize
> element in struct mm_context_t
>
> On the 8xx, the minimum slice size is the size of the area
> covered by a single PMD entry, ie 4M in 4K pages mode and 64M in
> 16K pages mode. This means we could have at least 64 slices.
>
> In order to override this limitation, this patch switches the
> handling of low_slices_psize to char array as done already for
> high_slices_psize. This allows to increase the number of low
> slices to 64 on the 8xx.
>

Maybe update the subject to "make low slice also a bitmap". Also
indicate that the bitmap functions optimize the operation if the
bitmap size is <= long? Also, switch the 8xx to the higher value in
another patch?

> Signed-off-by: Christophe Leroy
> ---
> v2: Using slice_bitmap_xxx() macros instead of bitmap_xxx() functions.
> v3: keep low_slices as a u64, this allows 64 slices which is enough.
>
>  arch/powerpc/include/asm/book3s/64/mmu.h |  3 +-
>  arch/powerpc/include/asm/mmu-8xx.h       |  7 +++-
>  arch/powerpc/include/asm/paca.h          |  2 +-
>  arch/powerpc/include/asm/slice.h         |  1 -
>  arch/powerpc/include/asm/slice_32.h      |  2 ++
>  arch/powerpc/include/asm/slice_64.h      |  2 ++
>  arch/powerpc/kernel/paca.c               |  3 +-
>  arch/powerpc/mm/hash_utils_64.c          | 13 ++++----
>  arch/powerpc/mm/slb_low.S                |  8 +++--
>  arch/powerpc/mm/slice.c                  | 57 +++++++++++++++++---------------
>  10 files changed, 56 insertions(+), 42 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index c9448e19847a..b076a2d74c69 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -91,7 +91,8 @@ typedef struct {
>  	struct npu_context *npu_context;
>
>  #ifdef CONFIG_PPC_MM_SLICES
> -	u64 low_slices_psize; /* SLB page size encodings */
> +	/* SLB page size encodings*/
> +	unsigned char low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
>  	unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
>  	unsigned long slb_addr_limit;
>  #else
> diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h
> index 5f89b6010453..5f37ba06b56c 100644
> --- a/arch/powerpc/include/asm/mmu-8xx.h
> +++ b/arch/powerpc/include/asm/mmu-8xx.h
> @@ -164,6 +164,11 @@
>   */
>  #define SPRN_M_TW 799
>
> +#ifdef CONFIG_PPC_MM_SLICES
> +#include
> +#define SLICE_ARRAY_SIZE (1 << (32 - SLICE_LOW_SHIFT - 1))
> +#endif
> +
>  #ifndef __ASSEMBLY__
>  typedef struct {
>  	unsigned int id;
> @@ -171,7 +176,7 @@ typedef struct {
>  	unsigned long vdso_base;
>  #ifdef CONFIG_PPC_MM_SLICES
>  	u16 user_psize; /* page size index */
> -	u64 low_slices_psize; /* page size encodings */
> +	unsigned char low_slices_psize[SLICE_ARRAY_SIZE];
>  	unsigned char high_slices_psize[0];
>  	unsigned long slb_addr_limit;
>  #endif
> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index 23ac7fc0af23..a3e531fe9ac7 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -141,7 +141,7 @@ struct paca_struct {
>  #ifdef CONFIG_PPC_BOOK3S
>  	mm_context_id_t mm_ctx_id;
>  #ifdef CONFIG_PPC_MM_SLICES
> -	u64 mm_ctx_low_slices_psize;
> +	unsigned char mm_ctx_low_slices_psize[BITS_PER_LONG / BITS_PER_BYTE];
>  	unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE];
>  	unsigned long mm_ctx_slb_addr_limit;
>  #else
> diff --git a/arch/powerpc/include/asm/slice.h b/arch/powerpc/include/asm/slice.h
> index 2b4b70de7e71..b67ba8faa507 100644
> --- a/arch/powerpc/include/asm/slice.h
> +++ b/arch/powerpc/include/asm/slice.h
> @@ -16,7 +16,6 @@
>  #define HAVE_ARCH_UNMAPPED_AREA
>  #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
>
> -#define SLICE_LOW_SHIFT 28
>  #define SLICE_LOW_TOP (0x100000000ull)
>  #define SLICE_NUM_LOW (SLICE_LOW_TOP >> SLICE_LOW_SHIFT)
>  #define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT)
> diff --git a/arch/powerpc/include/asm/slice_32.h b/arch/powerpc/include/asm/slice_32.h
> index 7e27c0dfb913..349187c20100 100644
> --- a/arch/powerpc/include/asm/slice_32.h
> +++ b/arch/powerpc/include/asm/slice_32.h
> @@ -2,6 +2,8 @@
>  #ifndef _ASM_POWERPC_SLICE_32_H
>  #define _ASM_POWERPC_SLICE_32_H
>
> +#define SLICE_LOW_SHIFT 26 /* 64 slices */
> +
>  #define SLICE_HIGH_SHIFT 0
>  #define SLICE_NUM_HIGH 0ul
>  #define GET_HIGH_SLICE_INDEX(addr) (addr & 0)
> diff --git a/arch/powerpc/include/asm/slice_64.h b/arch/powerpc/include/asm/slice_64.h
> index 9d1c97b83010..0959475239c6 100644
> --- a/arch/powerpc/include/asm/slice_64.h
> +++ b/arch/powerpc/include/asm/slice_64.h
> @@ -2,6 +2,8 @@
>  #ifndef _ASM_POWERPC_SLICE_64_H
>  #define _ASM_POWERPC_SLICE_64_H
>
> +#define SLICE_LOW_SHIFT 28
> +

You are moving SLICE_LOW_SHIFT here; maybe you can fix that up in the
earlier patch, as reviewed there.

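For reference, the slice counts and array sizes follow directly from the
shift. A minimal sketch, assuming the 4G low region given by SLICE_LOW_TOP
and 4 bits of psize per slice; the EX_/ex_ names are illustrative only and
are not part of the patch:

```c
/* Top of the low-slice region (4G), mirroring SLICE_LOW_TOP. */
#define EX_SLICE_LOW_TOP 0x100000000ull

/* Hypothetical helper: number of low slices for a given SLICE_LOW_SHIFT. */
static inline unsigned int ex_num_low_slices(unsigned int shift)
{
	return (unsigned int)(EX_SLICE_LOW_TOP >> shift);
}

/* Hypothetical helper: bytes needed when two 4-bit psizes share a byte. */
static inline unsigned int ex_psize_bytes(unsigned int shift)
{
	return ex_num_low_slices(shift) / 2;
}
```

With a shift of 28 this gives 16 slices in 8 bytes (the old u64), and with
a shift of 26 it gives 64 slices in 32 bytes, which matches
1 << (32 - SLICE_LOW_SHIFT - 1) in the mmu-8xx.h hunk.
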
>  #define SLICE_HIGH_SHIFT 40
>  #define SLICE_NUM_HIGH (H_PGTABLE_RANGE >> SLICE_HIGH_SHIFT)
>  #define GET_HIGH_SLICE_INDEX(addr) ((addr) >> SLICE_HIGH_SHIFT)
> diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
> index d6597038931d..8e1566bf82b8 100644
> --- a/arch/powerpc/kernel/paca.c
> +++ b/arch/powerpc/kernel/paca.c
> @@ -264,7 +264,8 @@ void copy_mm_to_paca(struct mm_struct *mm)
>  #ifdef CONFIG_PPC_MM_SLICES
>  	VM_BUG_ON(!mm->context.slb_addr_limit);
>  	get_paca()->mm_ctx_slb_addr_limit = mm->context.slb_addr_limit;
> -	get_paca()->mm_ctx_low_slices_psize = context->low_slices_psize;
> +	memcpy(&get_paca()->mm_ctx_low_slices_psize,
> +		&context->low_slices_psize, sizeof(context->low_slices_psize));
>  	memcpy(&get_paca()->mm_ctx_high_slices_psize,
>  		&context->high_slices_psize, TASK_SLICE_ARRAY_SZ(mm));
>  #else /* CONFIG_PPC_MM_SLICES */
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 655a5a9a183d..da696565b969 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -1097,19 +1097,18 @@ unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
>  #ifdef CONFIG_PPC_MM_SLICES
>  static unsigned int get_paca_psize(unsigned long addr)
>  {
> -	u64 lpsizes;
> -	unsigned char *hpsizes;
> +	unsigned char *psizes;
>  	unsigned long index, mask_index;
>
>  	if (addr < SLICE_LOW_TOP) {
> -		lpsizes = get_paca()->mm_ctx_low_slices_psize;
> +		psizes = get_paca()->mm_ctx_low_slices_psize;
>  		index = GET_LOW_SLICE_INDEX(addr);
> -		return (lpsizes >> (index * 4)) & 0xF;
> +	} else {
> +		psizes = get_paca()->mm_ctx_high_slices_psize;
> +		index = GET_HIGH_SLICE_INDEX(addr);
>  	}
> -	hpsizes = get_paca()->mm_ctx_high_slices_psize;
> -	index = GET_HIGH_SLICE_INDEX(addr);
>  	mask_index = index & 0x1;
> -	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xF;
> +	return (psizes[index >> 1] >> (mask_index * 4)) & 0xF;
>  }
>
>  #else
> diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
> index 2cf5ef3fc50d..2c7c717fd2ea 100644
> --- a/arch/powerpc/mm/slb_low.S
> +++ b/arch/powerpc/mm/slb_low.S
> @@ -200,10 +200,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
>  5:
>  	/*
>  	 * Handle lpsizes
> -	 * r9 is get_paca()->context.low_slices_psize, r11 is index
> +	 * r9 is get_paca()->context.low_slices_psize[index], r11 is mask_index
>  	 */
> -	ld r9,PACALOWSLICESPSIZE(r13)
> -	mr r11,r10
> +	srdi r11,r10,1 /* index */
> +	addi r9,r11,PACALOWSLICESPSIZE
> +	lbzx r9,r13,r9 /* r9 is lpsizes[r11] */
> +	rldicl r11,r10,0,63 /* r11 = r10 & 0x1 */
>  6:
>  	sldi r11,r11,2 /* index * 4 */
>  	/* Extract the psize and multiply to get an array offset */
> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
> index 549704dfa777..3d573a038d42 100644
> --- a/arch/powerpc/mm/slice.c
> +++ b/arch/powerpc/mm/slice.c
> @@ -148,18 +148,20 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret,
>  static void slice_mask_for_size(struct mm_struct *mm, int psize, struct slice_mask *ret,
>  				unsigned long high_limit)
>  {
> -	unsigned char *hpsizes;
> +	unsigned char *hpsizes, *lpsizes;
>  	int index, mask_index;
>  	unsigned long i;
> -	u64 lpsizes;
>
>  	ret->low_slices = 0;
>  	slice_bitmap_zero(ret->high_slices, SLICE_NUM_HIGH);
>
>  	lpsizes = mm->context.low_slices_psize;
> -	for (i = 0; i < SLICE_NUM_LOW; i++)
> -		if (((lpsizes >> (i * 4)) & 0xf) == psize)
> +	for (i = 0; i < SLICE_NUM_LOW; i++) {
> +		mask_index = i & 0x1;
> +		index = i >> 1;
> +		if (((lpsizes[index] >> (mask_index * 4)) & 0xf) == psize)
>  			ret->low_slices |= 1u << i;
> +	}
>
>  	if (high_limit <= SLICE_LOW_TOP)
>  		return;
> @@ -211,8 +213,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
>  {
>  	int index, mask_index;
>  	/* Write the new slice psize bits */
> -	unsigned char *hpsizes;
> -	u64 lpsizes;
> +	unsigned char *hpsizes, *lpsizes;
>  	unsigned long i, flags;
>
>  	slice_dbg("slice_convert(mm=%p, psize=%d)\n", mm, psize);
> @@ -225,12 +226,13 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
>
>  	lpsizes = mm->context.low_slices_psize;
>  	for (i = 0; i < SLICE_NUM_LOW; i++)
> -		if (mask.low_slices & (1u << i))
> -			lpsizes = (lpsizes & ~(0xful << (i * 4))) |
> -				(((unsigned long)psize) << (i * 4));
> -
> -	/* Assign the value back */
> -	mm->context.low_slices_psize = lpsizes;
> +		if (mask.low_slices & (1u << i)) {
> +			mask_index = i & 0x1;
> +			index = i >> 1;
> +			lpsizes[index] = (lpsizes[index] &
> +				~(0xf << (mask_index * 4))) |
> +				(((unsigned long)psize) << (mask_index * 4));
> +		}
>
>  	hpsizes = mm->context.high_slices_psize;
>  	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++) {
> @@ -629,7 +631,7 @@ unsigned long arch_get_unmapped_area_topdown(struct file *filp,
>
>  unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
>  {
> -	unsigned char *hpsizes;
> +	unsigned char *psizes;
>  	int index, mask_index;
>
>  	/*
> @@ -643,15 +645,14 @@ unsigned int get_slice_psize(struct mm_struct *mm, unsigned long addr)
>  #endif
>  	}
>  	if (addr < SLICE_LOW_TOP) {
> -		u64 lpsizes;
> -		lpsizes = mm->context.low_slices_psize;
> +		psizes = mm->context.low_slices_psize;
>  		index = GET_LOW_SLICE_INDEX(addr);
> -		return (lpsizes >> (index * 4)) & 0xf;
> +	} else {
> +		psizes = mm->context.high_slices_psize;
> +		index = GET_HIGH_SLICE_INDEX(addr);
>  	}
> -	hpsizes = mm->context.high_slices_psize;
> -	index = GET_HIGH_SLICE_INDEX(addr);
>  	mask_index = index & 0x1;
> -	return (hpsizes[index >> 1] >> (mask_index * 4)) & 0xf;
> +	return (psizes[index >> 1] >> (mask_index * 4)) & 0xf;
>  }
>  EXPORT_SYMBOL_GPL(get_slice_psize);
>
> @@ -672,8 +673,8 @@ EXPORT_SYMBOL_GPL(get_slice_psize);
>  void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
>  {
>  	int index, mask_index;
> -	unsigned char *hpsizes;
> -	unsigned long flags, lpsizes;
> +	unsigned char *hpsizes, *lpsizes;
> +	unsigned long flags;
>  	unsigned int old_psize;
>  	int i;
>
> @@ -691,12 +692,14 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
>  	wmb();
>
>  	lpsizes = mm->context.low_slices_psize;
> -	for (i = 0; i < SLICE_NUM_LOW; i++)
> -		if (((lpsizes >> (i * 4)) & 0xf) == old_psize)
> -			lpsizes = (lpsizes & ~(0xful << (i * 4))) |
> -				(((unsigned long)psize) << (i * 4));
> -	/* Assign the value back */
> -	mm->context.low_slices_psize = lpsizes;
> +	for (i = 0; i < SLICE_NUM_LOW; i++) {
> +		mask_index = i & 0x1;
> +		index = i >> 1;
> +		if (((lpsizes[index] >> (mask_index * 4)) & 0xf) == old_psize)
> +			lpsizes[index] = (lpsizes[index] &
> +				~(0xf << (mask_index * 4))) |
> +				(((unsigned long)psize) << (mask_index * 4));
> +	}
>
>  	hpsizes = mm->context.high_slices_psize;
>  	for (i = 0; i < SLICE_NUM_HIGH; i++) {
> --
> 2.13.3
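
To illustrate the encoding the patch now uses for low slices as well (two
4-bit psize values per byte, index = i >> 1, mask_index = i & 0x1), here is
a standalone sketch. ex_get_psize() and ex_set_psize() are hypothetical
helpers written for this mail, not functions from the kernel tree:

```c
/* Read the 4-bit psize of slice i from a nibble-packed byte array. */
static inline unsigned int ex_get_psize(const unsigned char *psizes,
					unsigned long i)
{
	unsigned long index = i >> 1;		/* which byte */
	unsigned int mask_index = i & 0x1;	/* low (0) or high (1) nibble */

	return (psizes[index] >> (mask_index * 4)) & 0xf;
}

/* Overwrite the 4-bit psize of slice i, leaving the other nibble intact. */
static inline void ex_set_psize(unsigned char *psizes, unsigned long i,
				unsigned int psize)
{
	unsigned long index = i >> 1;
	unsigned int mask_index = i & 0x1;

	psizes[index] = (psizes[index] & ~(0xf << (mask_index * 4))) |
			((psize & 0xf) << (mask_index * 4));
}
```

Even-numbered slices land in the low nibble of a byte and odd-numbered
slices in the high nibble, which is the same layout high_slices_psize
already uses.
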