Received: by 2002:ac0:aed5:0:0:0:0:0 with SMTP id t21csp373781imb; Fri, 1 Mar 2019 03:06:08 -0800 (PST) X-Google-Smtp-Source: APXvYqyf/HXQJ+VAKuOT5yP3igFiH8Lkx+yuGCUyI/fjk7F0kOT8QF1gyGGUv2xbngKGN0uRJVg2 X-Received: by 2002:a17:902:2903:: with SMTP id g3mr4827535plb.222.1551438368875; Fri, 01 Mar 2019 03:06:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1551438368; cv=none; d=google.com; s=arc-20160816; b=K0PA3xa7+qYJRPpsu/asBF+pbZT3I/vmv6BW4az+i1XnQvzf0BCGUqFudnciH94FY2 7Lv9bfS8kYHRMl7GQuzDovNVv6PS2bNVVHGLYhZV5NNaPsstlgag4Eiu3w+RktWdYhDb ad9X00G+pxoXgrhmjEO6cdKuQC3BicMqhaxjrdr+tW6zUZ51zVZtMBSrtsA609J6f4iP QGcUXrhI8zSWh8WcX/MydOLpVoTkn1lcNNag/T0CIqzgd+FqaVSshLYtN3MyaNs+tiKv ss67apYQRqmQysdA3r6MaoeYLGs4TedPSlWs2ySjBwQ6c5eIzwWJbJMKVOzp9B97LHiA oYcQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject; bh=0xswBHOlUs6t385pzGBHRKnKfEV3KkJK+5r23FX1n7g=; b=tPFDrVa1ALRKeZK8Nl+asrhI0k1mWJdVDTuMrzGMR/gTHg6TxM0AZx8Rxhj7BBd1oE Mcf8jltlhMuaOxROYhcVMzQ3uWOiOwLIljBGN32UYVqg585itolVPDDI5VT7JCcvucqE ir73vn8bBDYRi9kXcCWpCv7l9k3wCyCCWc1qMSJUiSSCH7XqN/4rqMSKmPwzIhLbeJRj DcnTabre/UVnwuMUzMs6ioaaRcp7FKmzR43RoR71V2P8/+DeNq6+XfoaQQLdSjLxM3BE vz2U7/Mc/NS1a+gWHzaqNoOF7xewJK2CxkWoR4qQgCIJH8CHTUbSps2Cqr8IkUfEZEAJ N40A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id x9si20947034pfe.254.2019.03.01.03.05.53; Fri, 01 Mar 2019 03:06:08 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387822AbfCALC7 (ORCPT + 99 others); Fri, 1 Mar 2019 06:02:59 -0500 Received: from foss.arm.com ([217.140.101.70]:33106 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727932AbfCALC6 (ORCPT ); Fri, 1 Mar 2019 06:02:58 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1CC78EBD; Fri, 1 Mar 2019 03:02:58 -0800 (PST) Received: from [10.1.196.69] (e112269-lin.cambridge.arm.com [10.1.196.69]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 507303F575; Fri, 1 Mar 2019 03:02:54 -0800 (PST) Subject: Re: [PATCH v3 11/34] mips: mm: Add p?d_large() definitions To: Paul Burton Cc: Mark Rutland , Peter Zijlstra , Catalin Marinas , Dave Hansen , Will Deacon , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "H. Peter Anvin" , "Liang, Kan" , "x86@kernel.org" , Ingo Molnar , James Hogan , Arnd Bergmann , =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= , Borislav Petkov , Andy Lutomirski , Thomas Gleixner , "linux-arm-kernel@lists.infradead.org" , Ard Biesheuvel , "linux-mips@vger.kernel.org" , Ralf Baechle , James Morse References: <20190227170608.27963-1-steven.price@arm.com> <20190227170608.27963-12-steven.price@arm.com> <20190228021526.bb6m3my46ohb4o6h@pburton-laptop> <74944d83-f3c0-ff02-590e-b7e5abcea485@arm.com> <20190228185526.hdryn2zsfign7vht@pburton-laptop> From: Steven Price Message-ID: <7e314e29-dab5-e941-60c8-05ea74747f4e@arm.com> Date: Fri, 1 Mar 2019 11:02:52 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.5.0 MIME-Version: 1.0 In-Reply-To: <20190228185526.hdryn2zsfign7vht@pburton-laptop> Content-Type: text/plain; charset=utf-8 Content-Language: en-GB Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 28/02/2019 18:55, Paul Burton wrote: > Hi Steven, > > On Thu, Feb 28, 2019 at 12:11:24PM +0000, Steven Price wrote: >> On 28/02/2019 02:15, Paul Burton wrote: >>> On Wed, Feb 27, 2019 at 05:05:45PM +0000, Steven Price wrote: >>>> For mips, we don't support large pages on 32 bit so add stubs returning 0. >>> >>> So far so good :) >>> >>>> For 64 bit look for _PAGE_HUGE flag being set. This means exposing the >>>> flag when !CONFIG_MIPS_HUGE_TLB_SUPPORT. >>> >>> Here I have to ask why? We could just return 0 like the mips32 case when >>> CONFIG_MIPS_HUGE_TLB_SUPPORT=n, let the compiler optimize the whole >>> thing out and avoid redundant work at runtime. >>> >>> This could be unified too in asm/pgtable.h - checking for >>> CONFIG_MIPS_HUGE_TLB_SUPPORT should be sufficient to cover the mips32 >>> case along with the subset of mips64 configurations without huge pages. >> >> The intention here is to define a new set of macros/functions which will >> always tell us whether we're at the leaf of a page table walk, whether >> or not huge pages are compiled into the kernel. Basically this allows >> the page walking code to be used on page tables other than user space, >> for instance the kernel page tables (which e.g. might use a large >> mapping for linear memory even if huge pages are not compiled in) or >> page tables from firmware (e.g. EFI on arm64). >> >> I'm not familiar enough with mips to know how it handles things like the >> linear map so I don't know how relevant that is, but I'm trying to >> introduce a new set of functions which differ from the existing >> p?d_huge() macros by not depending on whether these mappings could exist >> for a user space VMA (i.e. not depending on HUGETLB support and existing >> for all levels that architecturally they can occur at). > > Thanks for the explanation - the background helps. > > Right now for MIPS, with one exception, there'll be no difference > between a page being huge or large. So for the vast majority of kernels > with CONFIG_MIPS_HUGE_TLB_SUPPORT=n we should just return 0. > > The one exception I mentioned is old SGI IP27 support, which allows the > kernel to be mapped through the TLB & does that using 2x 16MB pages when > CONFIG_MAPPED_KERNEL=y. However even there your patch as-is won't pick > up on that for 2 reasons: > > 1) The pages in question don't appear to actually be recorded in the > page tables - they're just written straight into the TLB as wired > entries (ie. entries that will never be evicted). > > 2) Even if they were in the page tables the _PAGE_HUGE bit isn't set. > > Since those pages aren't recorded in the page tables anyway we'd either > need to: > > a) Add them to the page tables, and set the _PAGE_HUGE bit. > > b) Ignore them if the code you're working on won't be operating on the > memory mapping the kernel. > > For other platforms the kernel is run from unmapped memory, and for all > cases including IP27 the kernel will use unmapped memory to access > lowmem or peripherals when possible. That is, MIPS has virtual address > regions ((c)kseg[01] or xkphys) which are architecturally defined as > linear maps to physical memory & so VA->PA translation doesn't use the > TLB at all. > > So my thought would be that for almost everything we could just do: > > #define pmd_large(pmd) pmd_huge(pmd) > #define pud_large(pmd) pud_huge(pmd) > > And whether we need to do anything about IP27 depends on whether a) or > b) is chosen above. > > Or alternatively you could do something like: > > #ifdef _PAGE_HUGE > > static inline int pmd_large(pmd_t pmd) > { > return (pmd_val(pmd) & _PAGE_HUGE) != 0; > } > > static inline int pud_large(pud_t pud) > { > return (pud_val(pud) & _PAGE_HUGE) != 0; > } > > #else > # define pmd_large(pmd) 0 > # define pud_large(pud) 0 > #endif > > That would cover everything except for the IP27, but would make it pick > up the IP27 kernel pages automatically if someone later defines > _PAGE_HUGE for IP27 CONFIG_MAPPED_KERNEL=y & makes use of it for those > pages. Thanks for the detailed explanation. I think my preference is your change above (#ifdef _PAGE_HUGE) because I'm trying to stop people thinking p?d_large==p?d_huge. MIPS is a little different from other architectures in that the hardware doesn't walk the page tables, so there isn't a definitive answer as to whether there is a 'huge' bit in the tables or not - it actually does depend on the kernel configuration. For the IP27 case I think the current situation is probably fine - the intention is to walk the page tables, so even though the TLBs (and therefore the actual translations) might not match, at least p?d_large will accurately tell when the leaf of the page table tree has been reached. And as you say, using _PAGE_HUGE as the #ifdef means that should support be added the code should automatically make use of it. Thanks for your help, Steve