From: Anup Patel
Date: Thu, 14 Mar 2019 23:28:32 +0530
Subject: Re: [PATCH 3/3] RISC-V: Allow booting kernel from any 4KB aligned address
To: Mike Rapoport
Cc: Anup Patel, Palmer Dabbelt, Albert Ou, Atish Patra, Paul Walmsley, Christoph Hellwig, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org

On Thu, Mar 14, 2019 at 12:23 PM Mike Rapoport wrote:
>
> On Thu, Mar 14, 2019 at 02:36:01AM +0530, Anup Patel wrote:
> > On Thu, Mar 14, 2019 at 12:01 AM Mike Rapoport wrote:
> > >
> > > On Tue, Mar 12, 2019 at 10:08:22PM +0000, Anup Patel wrote:
> > > > Currently, we have to boot RISCV64 kernel from a 2MB aligned physical
> > > > address and RISCV32 kernel from a 4MB aligned physical address. This
> > > > constraint is because initial pagetable setup (i.e. setup_vm()) maps
> > > > entire RAM using hugepages (i.e. 2MB for 3-level pagetable and 4MB for
> > > > 2-level pagetable).
> > > >
> > > > Further, the above booting constraint also results in memory wastage
> > > > because if we boot kernel from some address (which is not same as
> > > > RAM start address) then RISCV kernel will map PAGE_OFFSET virtual address
> > > > linearly to physical address and memory between RAM start and that
> > > > address will be reserved/unusable.
> > > >
> > > > For example, RISCV64 kernel booted from 0x80200000 will waste 2MB of RAM
> > > > and RISCV32 kernel booted from 0x80400000 will waste 4MB of RAM.
> > > >
> > > > This patch re-writes the initial pagetable setup code to allow booting
> > > > RISCV32 and RISCV64 kernel from any 4KB (i.e. PAGE_SIZE) aligned address.
> > > >
> > > > To achieve this:
> > > > 1. We map kernel, dtb and only some amount of RAM (few MBs) using 4KB
> > > >    mappings in setup_vm() (called from head.S)
> > > > 2. Once we reach paging_init() (called from setup_arch()) after
> > > >    memblock setup, we map all available memory banks using 4KB
> > > >    mappings and memblock APIs.
> > >
> > > I'm not really familiar with RISC-V, but my guess would be that you'd get
> > > worse TLB performance with 4KB mappings. Not mentioning the amount of
> > > memory required for the page table itself.
> >
> > I agree we will see a hit in TLB performance due to 4KB mappings.
> >
> > To address this we can create 2MB (or 4MB on 32bit system) mappings
> > whenever load_pa is aligned to it, otherwise we prefer 4KB mappings. In other
> > words, we create bigger mappings whenever possible and fall back to 4KB
> > mappings when not possible.
> >
> > This way if kernel is booted from 2MB (or 4MB) aligned address then we will
> > see good TLB performance for kernel addresses. Also, users are still free to
> > boot Linux RISC-V kernel from any 4KB aligned address.
> >
> > Of course, we will have to document this as part of Linux RISC-V booting
> > requirements under Documentation/ (which does not exist currently).
> >
> > >
> > > If the only goal is to utilize the physical memory below the kernel, it
> > > simply should not be reserved in the first place, something like:
> >
> > Well, our goal was two-fold:
> >
> > 1. We wanted to unify boot-time alignment requirements for 32bit and
> >    64bit RISC-V systems
>
> Can't they both start from 4MB aligned address provided the memory below
> the kernel can be freed?

Yes, they can both start from 4MB aligned address.

>
> > 2. Save memory by allowing users to place kernel just after the runtime
> >    firmware at the start of RAM.
>
> If the firmware should be alive after kernel boot, its memory is the only
> part that should be reserved below the kernel. Otherwise, the entire region
> - can be free.
>
> Using 4K pages for the swapper_pg_dir is quite a change and I'm not
> convinced it's really justified.

I understand your concern about TLB performance and more page tables. Not
just 2MB/4MB mappings, we should be able to create even 1GB mappings as
well for good TLB performance.

I suggest we should use the best possible mapping size (4KB, 2MB, or 1GB)
based on alignment of the kernel load address. This way users can boot from
any 4KB aligned address and setup_vm() will try to use the biggest possible
mapping size.

For example, if the kernel load address is aligned to 2MB then we use 2MB
mappings, i.e. bigger mappings and fewer page tables. The same thing is
possible for 1GB mappings as well.

Regards,
Anup
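To make the suggestion concrete, here is a rough user-space sketch of how the
mapping size could be picked from the load address alignment on a 64bit
system. It is only an illustration: best_map_size(), nr_mappings() and the
SZ_* constants are hypothetical names, not code from the patch, and the
"size >= mapping size" check is an assumption about how small regions would
be handled.

```c
#include <stdint.h>

/* Candidate mapping sizes on RV64 (assumed: 4KB page, 2MB and 1GB hugepages). */
#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL
#define SZ_1G 0x40000000ULL

/*
 * Hypothetical helper (not from the patch): return the largest mapping
 * size that load_pa is aligned to, provided the region is at least that
 * large. Falls back to 4KB mappings when nothing bigger fits.
 */
static uint64_t best_map_size(uint64_t load_pa, uint64_t size)
{
	if (!(load_pa & (SZ_1G - 1)) && size >= SZ_1G)
		return SZ_1G;
	if (!(load_pa & (SZ_2M - 1)) && size >= SZ_2M)
		return SZ_2M;
	return SZ_4K;
}

/* Number of leaf mappings needed to cover size bytes (rounded up). */
static uint64_t nr_mappings(uint64_t size, uint64_t map_size)
{
	return (size + map_size - 1) / map_size;
}
```

With this sketch, a kernel loaded at 0x80200000 gets 2MB mappings, one loaded
at 0x80001000 falls back to 4KB mappings, and covering 64MB takes 32 leaf
entries with 2MB mappings instead of 16384 with 4KB mappings.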