LinuxLists.cc - [PATCH 0/3] Highmem support for 32-bit RISC-V

2020-03-31 09:50:27

Subject: [PATCH 0/3] Highmem support for 32-bit RISC-V

With Highmem support, the kernel can map more than 1GB of physical memory.

This patchset implements Highmem for RV32. It is based mostly on the nds32
implementation, with reference to others such as arm and mips, and it has
been tested on the Andes A25MP platform.

Eric Lin (3):
riscv/mm: Add pkmap region and CONFIG_HIGHMEM
riscv/mm: Implement kmap() and kmap_atomic()
riscv/mm: Add pkmap in print_vm_layout()

arch/riscv/Kconfig | 18 +++++++
arch/riscv/include/asm/fixmap.h | 9 +++-
arch/riscv/include/asm/highmem.h | 49 +++++++++++++++++
arch/riscv/include/asm/pgtable.h | 27 ++++++++++
arch/riscv/mm/Makefile | 1 +
arch/riscv/mm/highmem.c | 74 +++++++++++++++++++++++++
arch/riscv/mm/init.c | 92 ++++++++++++++++++++++++++++++--
7 files changed, 266 insertions(+), 4 deletions(-)
create mode 100644 arch/riscv/include/asm/highmem.h
create mode 100644 arch/riscv/mm/highmem.c

--
2.17.0

2020-03-31 09:50:31

by Eric Lin

Subject: [PATCH 3/3] riscv/mm: Add pkmap in print_vm_layout()

When CONFIG_HIGHMEM is enabled, lowmem is placed before the pkmap
region, and the memory layout looks like this:

Virtual kernel memory layout:
      lowmem : 0xc0000000 - 0xf5400000   ( 852 MB)
       pkmap : 0xf5600000 - 0xf5800000   (   2 MB)
      fixmap : 0xf5800000 - 0xf5c00000   (4096 kB)
      pci io : 0xf5c00000 - 0xf6c00000   (  16 MB)
     vmemmap : 0xf6c00000 - 0xf7bfffff   (  15 MB)
     vmalloc : 0xf7c00000 - 0xffc00000   ( 128 MB)

Signed-off-by: Eric Lin <[email protected]>
Cc: Alan Kao <[email protected]>
---
arch/riscv/mm/init.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 59afb479176a..b32d558e3f99 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -80,6 +80,12 @@ static inline void print_mlm(char *name, unsigned long b, unsigned long t)
static void print_vm_layout(void)
{
pr_notice("Virtual kernel memory layout:\n");
+#ifdef CONFIG_HIGHMEM
+ print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
+ (unsigned long)high_memory);
+ print_mlm("pkmap", (unsigned long)PKMAP_BASE,
+ (unsigned long)FIXADDR_START);
+#endif
print_mlk("fixmap", (unsigned long)FIXADDR_START,
(unsigned long)FIXADDR_TOP);
print_mlm("pci io", (unsigned long)PCI_IO_START,
@@ -88,8 +94,10 @@ static void print_vm_layout(void)
(unsigned long)VMEMMAP_END);
print_mlm("vmalloc", (unsigned long)VMALLOC_START,
(unsigned long)VMALLOC_END);
+#ifndef CONFIG_HIGHMEM
print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
(unsigned long)high_memory);
+#endif
}
#else
static void print_vm_layout(void) { }
--
2.17.0
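
Editorial note: the sizes in the layout quoted in the commit message above can be checked with a few lines of userspace C. This is only a sketch, not kernel code. The addresses are copied from the commit message; the `struct region` type and the `region_size()` and `print_layout()` helpers are hypothetical names introduced here, and the truncating right shift is an assumption about how the MB and kB figures are derived.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One line of the boot-time layout printout (hypothetical helper type). */
struct region {
	const char *name;
	uint32_t start;
	uint32_t end;
	int in_kb;		/* non-zero: report the size in kB instead of MB */
};

/* Addresses copied from the commit message of patch 3/3. */
const struct region layout[] = {
	{ "lowmem",  0xc0000000u, 0xf5400000u, 0 },
	{ "pkmap",   0xf5600000u, 0xf5800000u, 0 },
	{ "fixmap",  0xf5800000u, 0xf5c00000u, 1 },
	{ "pci io",  0xf5c00000u, 0xf6c00000u, 0 },
	{ "vmemmap", 0xf6c00000u, 0xf7bfffffu, 0 },
	{ "vmalloc", 0xf7c00000u, 0xffc00000u, 0 },
};

/* Size of a region, truncated to whole MB (or kB) by a right shift. */
uint32_t region_size(const struct region *r)
{
	assert(r->end >= r->start);
	return (r->end - r->start) >> (r->in_kb ? 10 : 20);
}

/* Print every region in the same column layout as the commit message. */
void print_layout(void)
{
	size_t i;

	printf("Virtual kernel memory layout:\n");
	for (i = 0; i < sizeof(layout) / sizeof(layout[0]); i++)
		printf("%12s : 0x%08lx - 0x%08lx   (%4lu %s)\n",
		       layout[i].name,
		       (unsigned long)layout[i].start,
		       (unsigned long)layout[i].end,
		       (unsigned long)region_size(&layout[i]),
		       layout[i].in_kb ? "kB" : "MB");
}
```

Calling `print_layout()` from a small `main()` prints the table. The vmemmap line shows 15 MB rather than 16 MB because its end address is inclusive (0xf7bfffff) and the shift truncates.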

2020-03-31 09:51:24

by Eric Lin

Subject: [PATCH 2/3] riscv/mm: Implement kmap() and kmap_atomic()

Both kmap() and kmap_atomic() let the kernel create temporary
mappings for a highmem page.

Be aware that kmap() may put the calling function to sleep, so it
cannot be used in interrupt context. kmap_atomic() is an atomic
version of kmap() that can be used in interrupt context, and it is
faster than kmap() because it does not take a lock.

Here we reserve some slots in the fixmap region for kmap_atomic(),
while kmap() uses the pkmap region.

Signed-off-by: Eric Lin <[email protected]>
Cc: Alan Kao <[email protected]>
---
arch/riscv/include/asm/fixmap.h | 9 +++-
arch/riscv/include/asm/highmem.h | 30 +++++++++++++
arch/riscv/include/asm/pgtable.h | 5 +++
arch/riscv/mm/Makefile | 1 +
arch/riscv/mm/highmem.c | 74 ++++++++++++++++++++++++++++++++
5 files changed, 118 insertions(+), 1 deletion(-)
create mode 100644 arch/riscv/mm/highmem.c

diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
index 42d2c42f3cc9..8dedc2bf2917 100644
--- a/arch/riscv/include/asm/fixmap.h
+++ b/arch/riscv/include/asm/fixmap.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (C) 2019 Western Digital Corporation or its affiliates.
+ * Copyright (C) 2020 Andes Technology Corporation
*/

#ifndef _ASM_RISCV_FIXMAP_H
@@ -10,6 +11,7 @@
#include <linux/sizes.h>
#include <asm/page.h>
#include <asm/pgtable.h>
+#include <asm/kmap_types.h>

#ifdef CONFIG_MMU
/*
@@ -28,7 +30,12 @@ enum fixed_addresses {
FIX_PTE,
FIX_PMD,
FIX_EARLYCON_MEM_BASE,
- __end_of_fixed_addresses
+#ifdef CONFIG_HIGHMEM
+ FIX_KMAP_RESERVED,
+ FIX_KMAP_BEGIN,
+ FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS),
+#endif
+ __end_of_fixed_addresses,
};

#define FIXMAP_PAGE_IO PAGE_KERNEL
diff --git a/arch/riscv/include/asm/highmem.h b/arch/riscv/include/asm/highmem.h
index 7fc79e58f607..ec7c83d55830 100644
--- a/arch/riscv/include/asm/highmem.h
+++ b/arch/riscv/include/asm/highmem.h
@@ -17,3 +17,33 @@
#define PKMAP_NR(virt) (((virt) - (PKMAP_BASE)) >> PAGE_SHIFT)
#define PKMAP_ADDR(nr) (PKMAP_BASE + ((nr) << PAGE_SHIFT))
#define kmap_prot PAGE_KERNEL
+
+static inline void flush_cache_kmaps(void)
+{
+ flush_cache_all();
+}
+
+/* Declarations for highmem.c */
+extern unsigned long highstart_pfn, highend_pfn;
+
+extern pte_t *pkmap_page_table;
+
+extern void *kmap_high(struct page *page);
+extern void kunmap_high(struct page *page);
+
+extern void kmap_init(void);
+
+/*
+ * The following functions are already defined by <linux/highmem.h>
+ * when CONFIG_HIGHMEM is not set.
+ */
+#ifdef CONFIG_HIGHMEM
+extern void *kmap(struct page *page);
+extern void kunmap(struct page *page);
+extern void *kmap_atomic(struct page *page);
+extern void __kunmap_atomic(void *kvaddr);
+extern void *kmap_atomic_pfn(unsigned long pfn);
+extern struct page *kmap_atomic_to_page(void *ptr);
+#endif
+
+#endif
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index d9a3769f1f4e..1a774d5a8bbc 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -200,6 +200,11 @@ static inline pgd_t *pgd_offset(const struct mm_struct *mm, unsigned long addr)
/* Locate an entry in the kernel page global directory */
#define pgd_offset_k(addr) pgd_offset(&init_mm, (addr))

+#ifdef CONFIG_HIGHMEM
+/* Locate an entry in the second-level page table */
+#define pmd_off_k(addr) pmd_offset((pud_t *)pgd_offset_k(addr), addr)
+#endif
+
static inline struct page *pmd_page(pmd_t pmd)
{
return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 50b7af58c566..6f9305afc632 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -16,6 +16,7 @@ obj-$(CONFIG_SMP) += tlbflush.o
endif
obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
obj-$(CONFIG_KASAN) += kasan_init.o
+obj-$(CONFIG_HIGHMEM) += highmem.o

ifdef CONFIG_KASAN
KASAN_SANITIZE_kasan_init.o := n
diff --git a/arch/riscv/mm/highmem.c b/arch/riscv/mm/highmem.c
new file mode 100644
index 000000000000..b01ebe34619e
--- /dev/null
+++ b/arch/riscv/mm/highmem.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2005-2020 Andes Technology Corporation
+ */
+
+#include <linux/export.h>
+#include <linux/highmem.h>
+#include <linux/sched.h>
+#include <linux/smp.h>
+#include <linux/interrupt.h>
+#include <asm/fixmap.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
+
+void *kmap(struct page *page)
+{
+	unsigned long vaddr;
+
+	might_sleep();
+	if (!PageHighMem(page))
+		return page_address(page);
+	vaddr = (unsigned long)kmap_high(page);
+	return (void *)vaddr;
+}
+EXPORT_SYMBOL(kmap);
+
+void kunmap(struct page *page)
+{
+	BUG_ON(in_interrupt());
+	if (!PageHighMem(page))
+		return;
+	kunmap_high(page);
+}
+EXPORT_SYMBOL(kunmap);
+
+void *kmap_atomic(struct page *page)
+{
+	unsigned int idx;
+	unsigned long vaddr;
+	int type;
+	pte_t *ptep;
+
+	preempt_disable();
+	pagefault_disable();
+
+	if (!PageHighMem(page))
+		return page_address(page);
+
+	type = kmap_atomic_idx_push();
+
+	idx = type + KM_TYPE_NR * smp_processor_id();
+	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
+
+	ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+	set_pte(ptep, mk_pte(page, kmap_prot));
+
+	return (void *)vaddr;
+}
+EXPORT_SYMBOL(kmap_atomic);
+
+void __kunmap_atomic(void *kvaddr)
+{
+	if (kvaddr >= (void *)FIXADDR_START && kvaddr < (void *)FIXADDR_TOP) {
+		unsigned long vaddr = (unsigned long)kvaddr;
+		pte_t *ptep;
+
+		kmap_atomic_idx_pop();
+		ptep = pte_offset_kernel(pmd_off_k(vaddr), vaddr);
+		set_pte(ptep, __pte(0));
+	}
+	pagefault_enable();
+	preempt_enable();
+}
+EXPORT_SYMBOL(__kunmap_atomic);
--
2.17.0
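
Editorial note: the slot selection in kmap_atomic() above (`idx = type + KM_TYPE_NR * smp_processor_id()`, followed by a fixmap lookup) can be modelled in userspace to see which virtual address each nested mapping gets. This is a sketch under stated assumptions, not the kernel implementation: `KM_TYPE_NR`, `NR_CPUS` and the numeric value of `FIX_KMAP_BEGIN` are picked for illustration, `FIXADDR_TOP` is taken from the layout in patch 3/3, the fixmap is assumed to grow downward from `FIXADDR_TOP`, and the `model_*` functions are hypothetical stand-ins for the per-CPU helpers.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define KM_TYPE_NR	20		/* assumed value, for illustration only */
#define NR_CPUS		4		/* assumed value, for illustration only */
#define FIXADDR_TOP	0xf5c00000u	/* from the layout in patch 3/3 */
#define FIX_KMAP_BEGIN	5		/* assumed position in enum fixed_addresses */

/* Per-CPU nesting depth; stands in for the kernel's per-CPU counter. */
static int kmap_depth[NR_CPUS];

/* Model of kmap_atomic_idx_push(): return the current depth, then bump it. */
int model_idx_push(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(kmap_depth[cpu] < KM_TYPE_NR);
	return kmap_depth[cpu]++;
}

/* Model of kmap_atomic_idx_pop(): drop one nesting level. */
void model_idx_pop(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(kmap_depth[cpu] > 0);
	kmap_depth[cpu]--;
}

/* Fixmap indices grow downward from FIXADDR_TOP, one page per index. */
uint32_t model_fix_to_virt(unsigned int idx)
{
	return FIXADDR_TOP - ((uint32_t)idx << PAGE_SHIFT);
}

/* Virtual address that the next kmap_atomic() on this CPU would use. */
uint32_t model_kmap_atomic_slot(int cpu)
{
	int type = model_idx_push(cpu);
	unsigned int idx = (unsigned int)type + KM_TYPE_NR * (unsigned int)cpu;

	return model_fix_to_virt(FIX_KMAP_BEGIN + idx);
}
```

In this model each CPU owns a private window of `KM_TYPE_NR` consecutive pages, and nested mappings on one CPU take successive pages within that window, so no cross-CPU locking is needed to pick a slot.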

2020-04-02 09:32:40

by Arnd Bergmann

Subject: Re: [PATCH 0/3] Highmem support for 32-bit RISC-V

On Tue, Mar 31, 2020 at 11:34 AM Eric Lin <[email protected]> wrote:
>
> With Highmem support, the kernel can map more than 1GB of physical memory.
>
> This patchset implements Highmem for RV32. It is based mostly on the nds32
> implementation, with reference to others such as arm and mips, and it has
> been tested on the Andes A25MP platform.

I would much prefer to not see highmem added to new architectures at all
if possible, see https://lwn.net/Articles/813201/ for some background.

For the arm32 architecture, we are thinking about implementing a
VMSPLIT_4G_4G option to replace highmem in the long run. The most
likely way this would turn out at the moment looks like:

- have a 256MB region for vmalloc space at the top of the 4GB address
space, containing vmlinux, module, mmio mappings and vmalloc
allocations

- have 3.75GB starting at address zero for either user space or the
linear map.

- reserve one address space ID for kernel mappings to avoid tlb flushes
during normal context switches

- On any kernel entry, switch the page table to the one with the linear
mapping, and back to the user page table before returning to user space

- add a generic copy_from_user/copy_to_user implementation based
on get_user_pages() in asm-generic/uaccess.h, using memcpy()
to copy from/to the page in the linear map.

- possibly have architectures override get_user/put_user to use a
cheaper access based on a page table switch to read individual
words if that is cheaper than get_user_pages().

There was an implementation of this for x86 a long time ago, but
it never got merged, mainly because there were no ASIDs on x86
at the time and the TLB flushing during context switches was really
expensive. As far as I can tell, all of the modern embedded cores
do have ASIDs, and unlike x86, most do not support more than 4GB
of physical RAM, so this scheme can work to replace highmem
in most of the remaining cases, and provide additional benefits
(larger user address range, better separation of kernel/user addresses)
at a relatively small performance cost.

Arnd
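
Editorial note: the copy_from_user idea in the list above (look up each user page, then memcpy() from the kernel's own mapping of it) can be illustrated with a userspace toy. This is a minimal sketch, not the proposed asm-generic implementation: `struct toy_mm`, `toy_lookup_page()` and `toy_copy_from_user()` are hypothetical names, the "user address space" is just an array of page pointers, and the lookup stands in for get_user_pages() on a single page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1u << PAGE_SHIFT)

/* Toy user address space: virtual page number -> backing page, or NULL. */
struct toy_mm {
	uint8_t **pages;
	size_t nr_pages;
};

/* Stand-in for pinning one user page and finding its linear-map address. */
uint8_t *toy_lookup_page(const struct toy_mm *mm, uint32_t uaddr)
{
	size_t vpn = uaddr >> PAGE_SHIFT;

	return vpn < mm->nr_pages ? mm->pages[vpn] : NULL;
}

/*
 * Copy len bytes from "user" address uaddr into dst, one page at a time.
 * Returns the number of bytes that could NOT be copied, following the
 * copy_from_user() convention.
 */
size_t toy_copy_from_user(void *dst, const struct toy_mm *mm,
			  uint32_t uaddr, size_t len)
{
	uint8_t *out = dst;

	assert(dst != NULL && mm != NULL);
	while (len > 0) {
		uint8_t *page = toy_lookup_page(mm, uaddr);
		size_t off = uaddr & (PAGE_SIZE - 1);
		size_t chunk = PAGE_SIZE - off;	/* bytes left in this page */

		if (!page)
			break;			/* unmapped: stop, report the rest */
		if (chunk > len)
			chunk = len;
		memcpy(out, page + off, chunk);
		out += chunk;
		uaddr += (uint32_t)chunk;
		len -= chunk;
	}
	return len;
}
```

The per-page loop is the point of the sketch: a copy that crosses a page boundary needs one lookup per page, which is why the proposal also mentions a cheaper path for single-word get_user/put_user accesses.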

2020-04-08 04:20:24

by Alan Kao

Subject: Re: [PATCH 0/3] Highmem support for 32-bit RISC-V

On Thu, Apr 02, 2020 at 11:31:37AM +0200, Arnd Bergmann wrote:
> On Tue, Mar 31, 2020 at 11:34 AM Eric Lin <[email protected]> wrote:
> >
> > With Highmem support, the kernel can map more than 1GB of physical memory.
> >
> > This patchset implements Highmem for RV32. It is based mostly on the nds32
> > implementation, with reference to others such as arm and mips, and it has
> > been tested on the Andes A25MP platform.
>
> I would much prefer to not see highmem added to new architectures at all
> if possible, see https://lwn.net/Articles/813201/ for some background.
>

Understood.

> For the arm32 architecture, we are thinking about implementing a
> VMSPLIT_4G_4G option to replace highmem in the long run. The most
> likely way this would turn out at the moment looks like:
>

Thanks for sharing the status from ARM32. Is there any available branch
already? It would be good to have a reference implementation.

> - have a 256MB region for vmalloc space at the top of the 4GB address
> space, containing vmlinux, module, mmio mappings and vmalloc
> allocations
>
> - have 3.75GB starting at address zero for either user space or the
> linear map.
>
> - reserve one address space ID for kernel mappings to avoid tlb flushes
> during normal context switches
>
> - On any kernel entry, switch the page table to the one with the linear
> mapping, and back to the user page table before returning to user space
>

After some searching I found a previous discussion
(https://lkml.org/lkml/2019/4/24/2110). The 5.2-based patch ended up not
being merged, but at least we will have something to start from if we want to.

Also, interestingly, there was a PR for the privileged spec that separates
addressing modes (https://github.com/riscv/riscv-isa-manual/pull/128) as
the Sdas extension, but there has been no progress since.

Not very related to this thread, but there was some discussion about
ASID design in RISC-V (https://github.com/riscv/riscv-isa-manual/issues/348).
It is now in the ratified 1.11 privileged spec.

> - add a generic copy_from_user/copy_to_user implementation based
> on get_user_pages() in asm-generic/uaccess.h, using memcpy()
> to copy from/to the page in the linear map.
>
> - possibly have architectures override get_user/put_user to use a
> cheaper access based on a page table switch to read individual
> words if that is cheaper than get_user_pages().
>
> There was an implementation of this for x86 a long time ago, but
> it never got merged, mainly because there were no ASIDs on x86
> at the time and the TLB flushing during context switches was really
> expensive. As far as I can tell, all of the modern embedded cores
> do have ASIDs, and unlike x86, most do not support more than 4GB
> of physical RAM, so this scheme can work to replace highmem
> in most of the remaining cases, and provide additional benefits
> (larger user address range, better separation of kernel/user addresses)
> at a relatively small performance cost.
>
> Arnd
>

It seems to me that VMSPLIT_4G_4G is quite different from the other VMSPLIT
options, because it requires many more changes.

Thanks for explaining the kernel community's stance against HIGHMEM support.
The cited discussion thread is comprehensive and clear. Although RV32
users cannot get upstream support for their large memory, mechanisms like
VMSPLIT_4G_4G seem to be a promising way to go. That being said, to
support the theoretical 16GB of physical memory, kmap* will eventually
still be needed.

Alan
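
Editorial note: the "theoretical 16G" figure follows from the Sv32 page table entry format, where the physical page number is 22 bits wide (12 bits of PPN[1] plus 10 bits of PPN[0]) on top of a 12-bit page offset. The arithmetic can be written out as a short sketch; the macro and function names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SV32_PAGE_SHIFT	12	/* 4 KiB pages */
#define SV32_PPN_BITS	22	/* PPN[1] (12 bits) + PPN[0] (10 bits) */
#define SV32_VA_BITS	32	/* virtual addresses are 32 bits wide */

/* Width of a physical address reachable through an Sv32 PTE. */
unsigned int sv32_phys_addr_bits(void)
{
	return SV32_PPN_BITS + SV32_PAGE_SHIFT;
}

/* Number of bytes addressable with the given address width. */
uint64_t addr_space_bytes(unsigned int bits)
{
	assert(bits < 64);
	return (uint64_t)1 << bits;
}

/* How many times larger the physical space is than the virtual one. */
uint64_t sv32_phys_to_virt_ratio(void)
{
	return addr_space_bytes(sv32_phys_addr_bits()) /
	       addr_space_bytes(SV32_VA_BITS);
}
```

With 34 physical address bits against 32 virtual ones, physical memory can be up to four times larger than the whole virtual address space, so no fixed linear mapping can cover all of it at once.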

2020-04-08 15:23:54

by Arnd Bergmann

Subject: Re: [PATCH 0/3] Highmem support for 32-bit RISC-V

On Wed, Apr 8, 2020 at 5:52 AM Alan Kao <[email protected]> wrote:
> On Thu, Apr 02, 2020 at 11:31:37AM +0200, Arnd Bergmann wrote:
> > On Tue, Mar 31, 2020 at 11:34 AM Eric Lin <[email protected]> wrote:
> > For the arm32 architecture, we are thinking about implementing a
> > VMSPLIT_4G_4G option to replace highmem in the long run. The most
> > likely way this would turn out at the moment looks like:
> >
>
> Thanks for sharing the status from ARM32. Is there any available branch
> already? It would be good to have a reference implementation.

No code yet, so far not much more than the ideas that I listed. We
are currently looking for someone interested in doing the work
or maybe sponsoring it if they have a strong interest.

If someone does it for RISC-V first, that would of course also help on ARM ;-)

> > - have a 256MB region for vmalloc space at the top of the 4GB address
> > space, containing vmlinux, module, mmio mappings and vmalloc
> > allocations
> >
> > - have 3.75GB starting at address zero for either user space or the
> > linear map.
> >
> > - reserve one address space ID for kernel mappings to avoid tlb flushes
> > during normal context switches
> >
> > - On any kernel entry, switch the page table to the one with the linear
> > mapping, and back to the user page table before returning to user space
> >
>
> After some searching I found a previous discussion
> (https://lkml.org/lkml/2019/4/24/2110). The 5.2-based patch ended up not
> being merged, but at least we will have something to start from if we want to.

Ah, I see. What is the current requirement for ASIDs in hardware
implementations? If support for more than one address space is
optional, that would make VMSPLIT_4G_4G support fairly expensive
on hardware without it, as it requires a full TLB flush for each
context switch.

> Also, interestingly, there was a PR for the privileged spec that separates
> addressing modes (https://github.com/riscv/riscv-isa-manual/pull/128) as
> the Sdas extension, but there has been no progress since.

Right, this sounds like the ideal implementation. This is what is done
in arch/s390 and probably a few of the others.

> Not very related to this thread, but there was some discussion about
> ASID design in RISC-V (https://github.com/riscv/riscv-isa-manual/issues/348).
> It is now in the ratified 1.11 privileged spec.

Ok, so I suppose that would apply to about half the 32-bit implementations
and most of the future ones, but not the ones implementing the 1.10 spec
or earlier, right?

> It seems to me that VMSPLIT_4G_4G is quite different from the other VMSPLIT
> options, because it requires many more changes.
>
> Thanks for explaining the kernel community's stance against HIGHMEM support.
> The cited discussion thread is comprehensive and clear. Although RV32
> users cannot get upstream support for their large memory, mechanisms like
> VMSPLIT_4G_4G seem to be a promising way to go. That being said, to
> support the theoretical 16GB of physical memory, kmap* will eventually
> still be needed.

I had not realized that Sv32 supports more than 4GB physical address
space at all. I agree that if someone puts that much RAM into a machine,
there are few alternatives to highmem (in theory one could use the
extra RAM for zswap/zram, but that's not a good replacement).

OTOH actually using more than 1GB or 2GB of physical memory on a
32-bit core is something that I expect to become completely obscure
in the future, as this is where using 32-bit cores tends to get
uneconomical. The situation that I observe across the currently supported
32-bit architectures in the kernel is that:

- There is an incentive to run 32-bit on machines with 1GB of RAM or less
if you have the choice, because of higher memory consumption and
cache utilization on 64-bit code. On systems with 2GB or more, the
cost of managing that memory using 32-bit code usually outweighs
the benefits and you should run at least a 64-bit kernel.

- The high end 32-bit cores (Arm Cortex-A15/A17, MIPS P5600,
PowerPC 750, Intel Pentium 4, Andes A15/D15, ...) are all obsolete
after the follow-on products use 64-bit cores on a smaller process
node, which end up being more capable, faster *and* cheaper.

- The 32-bit cores that do survive are based on simpler in-order
pipelines that are cheaper and can still beat the 64-bit cores in
terms of cost (mostly chip area, sometimes royalties), but not
performance. This includes Arm Cortex-A7, MIPS 24k and typical
RV32 cores.

- On an SoC with a cheap and simple CPU core, there is no point
in spending a lot of money/area/complexity on a high-end memory
controller. On single-core 32-bit SoCs, you usually end up with a single
16- or 32-bit wide DDR2 memory controller; on an SMP system like
most quad-Cortex-A7 designs, you have a 32-bit wide DDR3 controller,
but no DDR4 or LP-DDR3/4.

- The largest economical memory configuration on a 32-bit DDR3
controller is to have two 256Mx16 chips for a total of 1GB. You can
get 2GB with four chips using dual-channel controllers or 512Mx8
memory, but anything beyond that is much more expensive than
upgrading to a 64-bit SoC with LP-DDR4.

This is unlikely to change over time as 64-bit chips are also getting
cheaper and may replace more of the 32-bit chips we see today.
In particular, I expect to see multi-core chips moving to mostly
64-bit cores over time, while 32-bit chips keep using one or
occasionally two cores, further reducing the need for large and/or
fast memory.

Arnd
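
Editorial note: the DDR3 capacity figures in the list above can be reproduced with simple arithmetic, reading a part name like "256Mx16" as 256M addressable locations of 16 bits each. This is a sketch of that reading only; the `struct dram_config` type and the helper functions are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* A memory configuration built from identical "<depth>Mx<width>" chips. */
struct dram_config {
	uint64_t depth_m;	/* addressable locations per chip, in Mi */
	unsigned int width_bits;	/* data width of one chip */
	unsigned int chips;	/* number of chips populated */
};

/* Capacity of one chip in bytes. */
uint64_t chip_bytes(const struct dram_config *c)
{
	assert(c->width_bits % 8 == 0);
	return c->depth_m * 1024 * 1024 * (c->width_bits / 8);
}

/* Total capacity of the configuration in bytes. */
uint64_t config_bytes(const struct dram_config *c)
{
	return chip_bytes(c) * c->chips;
}

/* Combined data width when every chip sits side by side on the bus. */
unsigned int config_bus_bits(const struct dram_config *c)
{
	return c->width_bits * c->chips;
}
```

Two 256Mx16 chips fill a 32-bit bus and give 1GB; four 512Mx8 chips fill the same 32-bit bus and give 2GB; four 256Mx16 chips also give 2GB but need 64 data bits in total, which corresponds to the dual-channel case mentioned above.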
