2022-10-18 22:35:12

by Giulio Benetti

Subject: [PATCH v2 1/2] ARM: mm: fix no-MMU ZERO_PAGE() implementation

Currently, on no-MMU SoCs (e.g. i.MXRT), ZERO_PAGE(vaddr) expands to
```
virt_to_page(0)
```
which in turn expands to:
```
pfn_to_page(virt_to_pfn(0))
```
and virt_to_pfn(0) then expands to:
```
#define virt_to_pfn(0) \
((((unsigned long)(0) - PAGE_OFFSET) >> PAGE_SHIFT) + \
PHYS_PFN_OFFSET)
```
where PAGE_OFFSET and PHYS_PFN_OFFSET correspond to the DRAM offset
(0x80000000) and PAGE_SHIFT is 12. This way we obtain 16MB (0x01000000)
added to the base of DRAM (0x80000000).
When ZERO_PAGE(0) is then used, for example in bio_add_page(), the page
gets an address that is out of DRAM bounds.
So instead of using a fake virtual page 0, let's allocate a dedicated
zero_page during paging_init() and assign it to a global 'struct page *
empty_zero_page', the same way mmu.c does. This is the same approach used
in m68k with commit dc068f462179, as discussed here[0]. Then let's move the
ZERO_PAGE() definition to the top of pgtable.h so it is shared between
mmu.c and nommu.c.
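
For readers following the arithmetic, here is a minimal user-space sketch of
the expansion above. It is not kernel code: uint32_t stands in for the 32-bit
unsigned long, the sketch_* helpers are hypothetical names, and both the flat
"mem_map + (pfn - PHYS_PFN_OFFSET)" indexing and the 32-byte struct page size
are assumptions about the configuration, not something stated in the patch.

```c
#include <stdint.h>

/* Values from the description above; uint32_t models a 32-bit unsigned long. */
#define PAGE_SHIFT       12u
#define PAGE_OFFSET      UINT32_C(0x80000000)
#define PHYS_PFN_OFFSET  (PAGE_OFFSET >> PAGE_SHIFT)   /* 0x80000 */

/* Assumption: struct page occupies 32 bytes on this 32-bit configuration. */
#define ASSUMED_STRUCT_PAGE_SIZE UINT32_C(32)

/* Mirrors the virt_to_pfn() expansion quoted in the commit log. */
static inline uint32_t sketch_virt_to_pfn(uint32_t vaddr)
{
	/* For vaddr == 0 the subtraction wraps around to 0x80000000. */
	return ((vaddr - PAGE_OFFSET) >> PAGE_SHIFT) + PHYS_PFN_OFFSET;
}

/* Assumption: flat memory model, page array indexed from PHYS_PFN_OFFSET. */
static inline uint32_t sketch_mem_map_index(uint32_t pfn)
{
	return pfn - PHYS_PFN_OFFSET;
}

/* Byte offset of the resulting struct page from the start of the array. */
static inline uint32_t sketch_mem_map_byte_offset(uint32_t pfn)
{
	return sketch_mem_map_index(pfn) * ASSUMED_STRUCT_PAGE_SIZE;
}
```

Under those assumptions, sketch_virt_to_pfn(0) gives PFN 0x100000, which is
0x80000 entries past the start of the page array; at 32 bytes per entry that
is 0x01000000 bytes (16MB), in line with the figure quoted above.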

[0]: https://lore.kernel.org/linux-m68k/[email protected]/T/#m1266ceb63ad140743174d6b3070364d3c9a5179b

Signed-off-by: Giulio Benetti <[email protected]>
---
V1->V2:
* improve commit log as suggested by Arnd Bergmann
---
arch/arm/include/asm/pgtable-nommu.h | 6 ------
arch/arm/include/asm/pgtable.h | 16 +++++++++-------
arch/arm/mm/nommu.c | 19 +++++++++++++++++++
3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-nommu.h b/arch/arm/include/asm/pgtable-nommu.h
index d16aba48fa0a..090011394477 100644
--- a/arch/arm/include/asm/pgtable-nommu.h
+++ b/arch/arm/include/asm/pgtable-nommu.h
@@ -44,12 +44,6 @@

typedef pte_t *pte_addr_t;

-/*
- * ZERO_PAGE is a global shared page that is always zero: used
- * for zero-mapped memory areas etc..
- */
-#define ZERO_PAGE(vaddr) (virt_to_page(0))
-
/*
* Mark the prot value as uncacheable and unbufferable.
*/
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 78a532068fec..ef48a55e9af8 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -10,6 +10,15 @@
#include <linux/const.h>
#include <asm/proc-fns.h>

+#ifndef __ASSEMBLY__
+/*
+ * ZERO_PAGE is a global shared page that is always zero: used
+ * for zero-mapped memory areas etc..
+ */
+extern struct page *empty_zero_page;
+#define ZERO_PAGE(vaddr) (empty_zero_page)
+#endif
+
#ifndef CONFIG_MMU

#include <asm-generic/pgtable-nopud.h>
@@ -139,13 +148,6 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
*/

#ifndef __ASSEMBLY__
-/*
- * ZERO_PAGE is a global shared page that is always zero: used
- * for zero-mapped memory areas etc..
- */
-extern struct page *empty_zero_page;
-#define ZERO_PAGE(vaddr) (empty_zero_page)
-

extern pgd_t swapper_pg_dir[PTRS_PER_PGD];

diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index c42debaded95..c1494a4dee25 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -26,6 +26,13 @@

unsigned long vectors_base;

+/*
+ * empty_zero_page is a special page that is used for
+ * zero-initialized data and COW.
+ */
+struct page *empty_zero_page;
+EXPORT_SYMBOL(empty_zero_page);
+
#ifdef CONFIG_ARM_MPU
struct mpu_rgn_info mpu_rgn_info;
#endif
@@ -148,9 +155,21 @@ void __init adjust_lowmem_bounds(void)
*/
void __init paging_init(const struct machine_desc *mdesc)
{
+ void *zero_page;
+
early_trap_init((void *)vectors_base);
mpu_setup();
+
+ /* allocate the zero page. */
+ zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+ if (!zero_page)
+ panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
+ __func__, PAGE_SIZE, PAGE_SIZE);
+
bootmem_init();
+
+ empty_zero_page = virt_to_page(zero_page);
+ flush_dcache_page(empty_zero_page);
}

/*
--
2.34.1


2022-10-18 22:36:20

by Giulio Benetti

Subject: [PATCH v2 2/2] ARM: mm: convert empty_zero_page to array for consistency

ARM is the only architecture where empty_zero_page is a struct page
pointer; in all other implementations empty_zero_page is a data pointer
or directly an array (the zero page itself). So let's convert
empty_zero_page to an array for consistency and to avoid an early
allocation + dcache flush. Since the array lives in .bss, it is cleared
earlier, in a more linear (and slightly faster) way.
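
As a rough user-space analogy of the array approach (a sketch only, not the
kernel definition): a page-sized, page-aligned static array with no
initializer has static storage duration, so it starts out zeroed without any
run-time allocation. SKETCH_PAGE_SIZE, sketch_zero_page and the helpers are
hypothetical names; C11 alignas is used here in place of the kernel's
__page_aligned_bss annotation, and a 4096-byte page is assumed.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, matching PAGE_SHIFT == 12 in patch 1/2. */
#define SKETCH_PAGE_SIZE 4096u

/* No initializer: the object has static storage and is zero-initialized. */
static alignas(SKETCH_PAGE_SIZE) unsigned long
sketch_zero_page[SKETCH_PAGE_SIZE / sizeof(unsigned long)];

/* Returns 1 if the pointer sits on a page boundary. */
static inline int sketch_is_page_aligned(const void *p)
{
	return ((uintptr_t)p % SKETCH_PAGE_SIZE) == 0;
}

/* Returns 1 if all n words starting at p are zero. */
static inline int sketch_is_all_zero(const unsigned long *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

The trade-off is that the array's address is fixed at link time, whereas the
pointer variant in patch 1/2 picks the page at boot.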

Suggested-by: Arnd Bergmann <[email protected]>
Signed-off-by: Giulio Benetti <[email protected]>
---
V1->V2:
* create patch suggested by Arnd Bergmann
---
arch/arm/include/asm/pgtable.h | 4 ++--
arch/arm/mm/mmu.c | 10 +---------
arch/arm/mm/nommu.c | 14 +-------------
3 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index ef48a55e9af8..de402b345f55 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -15,8 +15,8 @@
* ZERO_PAGE is a global shared page that is always zero: used
* for zero-mapped memory areas etc..
*/
-extern struct page *empty_zero_page;
-#define ZERO_PAGE(vaddr) (empty_zero_page)
+extern unsigned long empty_zero_page[];
+#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
#endif

#ifndef CONFIG_MMU
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 463fc2a8448f..f05a5471a45a 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -45,7 +45,7 @@ extern unsigned long __atags_pointer;
* empty_zero_page is a special page that is used for
* zero-initialized data and COW.
*/
-struct page *empty_zero_page;
+unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
EXPORT_SYMBOL(empty_zero_page);

/*
@@ -1760,8 +1760,6 @@ static void __init early_fixmap_shutdown(void)
*/
void __init paging_init(const struct machine_desc *mdesc)
{
- void *zero_page;
-
pr_debug("physical kernel sections: 0x%08llx-0x%08llx\n",
kernel_sec_start, kernel_sec_end);

@@ -1782,13 +1780,7 @@ void __init paging_init(const struct machine_desc *mdesc)

top_pmd = pmd_off_k(0xffff0000);

- /* allocate the zero page. */
- zero_page = early_alloc(PAGE_SIZE);
-
bootmem_init();
-
- empty_zero_page = virt_to_page(zero_page);
- __flush_dcache_page(NULL, empty_zero_page);
}

void __init early_mm_init(const struct machine_desc *mdesc)
diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
index c1494a4dee25..e0c3f59d1c5a 100644
--- a/arch/arm/mm/nommu.c
+++ b/arch/arm/mm/nommu.c
@@ -30,7 +30,7 @@ unsigned long vectors_base;
* empty_zero_page is a special page that is used for
* zero-initialized data and COW.
*/
-struct page *empty_zero_page;
+unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
EXPORT_SYMBOL(empty_zero_page);

#ifdef CONFIG_ARM_MPU
@@ -155,21 +155,9 @@ void __init adjust_lowmem_bounds(void)
*/
void __init paging_init(const struct machine_desc *mdesc)
{
- void *zero_page;
-
early_trap_init((void *)vectors_base);
mpu_setup();
-
- /* allocate the zero page. */
- zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
- if (!zero_page)
- panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
- __func__, PAGE_SIZE, PAGE_SIZE);
-
bootmem_init();
-
- empty_zero_page = virt_to_page(zero_page);
- flush_dcache_page(empty_zero_page);
}

/*
--
2.34.1

2022-10-19 15:37:24

by Russell King (Oracle)

Subject: Re: [PATCH v2 2/2] ARM: mm: convert empty_zero_page to array for consistency

On Wed, Oct 19, 2022 at 12:25:03AM +0200, Giulio Benetti wrote:
> ARM is the only architecture where empty_zero_page is a struct page
> pointer; in all other implementations empty_zero_page is a data pointer
> or directly an array (the zero page itself). So let's convert
> empty_zero_page to an array for consistency and to avoid an early
> allocation + dcache flush. Since the array lives in .bss, it is cleared
> earlier, in a more linear (and slightly faster) way.
>
> Suggested-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Giulio Benetti <[email protected]>

I'm completely against this approach. It introduces inefficiencies in
paths we don't need, and also means that the zero page is at a fixed
location relative to the kernel, neither of which I like in the
slightest.

Thanks.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

2022-10-19 17:09:38

by Giulio Benetti

Subject: Re: [PATCH v2 2/2] ARM: mm: convert empty_zero_page to array for consistency

Hello Russell,

On 19/10/22 16:44, Russell King (Oracle) wrote:
> On Wed, Oct 19, 2022 at 12:25:03AM +0200, Giulio Benetti wrote:
>> ARM is the only architecture where empty_zero_page is a struct page
>> pointer; in all other implementations empty_zero_page is a data pointer
>> or directly an array (the zero page itself). So let's convert
>> empty_zero_page to an array for consistency and to avoid an early
>> allocation + dcache flush. Since the array lives in .bss, it is cleared
>> earlier, in a more linear (and slightly faster) way.
>>
>> Suggested-by: Arnd Bergmann <[email protected]>
>> Signed-off-by: Giulio Benetti <[email protected]>
>
> I'm completely against this approach. It introduces inefficiencies in
> paths we don't need, and also means that the zero page is at a fixed
> location relative to the kernel, neither of which I like in the
> slightest.

I hadn't considered those details; I'm pretty new to this topic.
I was thinking with a no-MMU approach in mind, which is why I went for
the .bss approach. Also, exposing the entire array to other subsystems
is not a good idea.

Thank you for pointing this out.

Best regards
--
Giulio Benetti
CEO/CTO@Benetti Engineering sas

2022-11-04 20:39:54

by Giulio Benetti

Subject: Re: [PATCH v2 1/2] ARM: mm: fix no-MMU ZERO_PAGE() implementation

Hello Arnd, Russell, All,

is this patch OK, or does it need any changes?

[PATCH 2/2], on the other hand, has a NAK and can be dropped.

Best regards
--
Giulio Benetti
Benetti Engineering sas

On 19/10/22 00:25, Giulio Benetti wrote:
> Currently, on no-MMU SoCs (e.g. i.MXRT), ZERO_PAGE(vaddr) expands to
> ```
> virt_to_page(0)
> ```
> which in turn expands to:
> ```
> pfn_to_page(virt_to_pfn(0))
> ```
> and virt_to_pfn(0) then expands to:
> ```
> #define virt_to_pfn(0) \
> ((((unsigned long)(0) - PAGE_OFFSET) >> PAGE_SHIFT) + \
> PHYS_PFN_OFFSET)
> ```
> where PAGE_OFFSET and PHYS_PFN_OFFSET correspond to the DRAM offset
> (0x80000000) and PAGE_SHIFT is 12. This way we obtain 16MB (0x01000000)
> added to the base of DRAM (0x80000000).
> When ZERO_PAGE(0) is then used, for example in bio_add_page(), the page
> gets an address that is out of DRAM bounds.
> So instead of using a fake virtual page 0, let's allocate a dedicated
> zero_page during paging_init() and assign it to a global 'struct page *
> empty_zero_page', the same way mmu.c does. This is the same approach used
> in m68k with commit dc068f462179, as discussed here[0]. Then let's move the
> ZERO_PAGE() definition to the top of pgtable.h so it is shared between
> mmu.c and nommu.c.
>
> [0]: https://lore.kernel.org/linux-m68k/[email protected]/T/#m1266ceb63ad140743174d6b3070364d3c9a5179b
>
> Signed-off-by: Giulio Benetti <[email protected]>
> ---
> V1->V2:
> * improve commit log as suggested by Arnd Bergmann
> ---
> arch/arm/include/asm/pgtable-nommu.h | 6 ------
> arch/arm/include/asm/pgtable.h | 16 +++++++++-------
> arch/arm/mm/nommu.c | 19 +++++++++++++++++++
> 3 files changed, 28 insertions(+), 13 deletions(-)
>
> diff --git a/arch/arm/include/asm/pgtable-nommu.h b/arch/arm/include/asm/pgtable-nommu.h
> index d16aba48fa0a..090011394477 100644
> --- a/arch/arm/include/asm/pgtable-nommu.h
> +++ b/arch/arm/include/asm/pgtable-nommu.h
> @@ -44,12 +44,6 @@
>
> typedef pte_t *pte_addr_t;
>
> -/*
> - * ZERO_PAGE is a global shared page that is always zero: used
> - * for zero-mapped memory areas etc..
> - */
> -#define ZERO_PAGE(vaddr) (virt_to_page(0))
> -
> /*
> * Mark the prot value as uncacheable and unbufferable.
> */
> diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
> index 78a532068fec..ef48a55e9af8 100644
> --- a/arch/arm/include/asm/pgtable.h
> +++ b/arch/arm/include/asm/pgtable.h
> @@ -10,6 +10,15 @@
> #include <linux/const.h>
> #include <asm/proc-fns.h>
>
> +#ifndef __ASSEMBLY__
> +/*
> + * ZERO_PAGE is a global shared page that is always zero: used
> + * for zero-mapped memory areas etc..
> + */
> +extern struct page *empty_zero_page;
> +#define ZERO_PAGE(vaddr) (empty_zero_page)
> +#endif
> +
> #ifndef CONFIG_MMU
>
> #include <asm-generic/pgtable-nopud.h>
> @@ -139,13 +148,6 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
> */
>
> #ifndef __ASSEMBLY__
> -/*
> - * ZERO_PAGE is a global shared page that is always zero: used
> - * for zero-mapped memory areas etc..
> - */
> -extern struct page *empty_zero_page;
> -#define ZERO_PAGE(vaddr) (empty_zero_page)
> -
>
> extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
>
> diff --git a/arch/arm/mm/nommu.c b/arch/arm/mm/nommu.c
> index c42debaded95..c1494a4dee25 100644
> --- a/arch/arm/mm/nommu.c
> +++ b/arch/arm/mm/nommu.c
> @@ -26,6 +26,13 @@
>
> unsigned long vectors_base;
>
> +/*
> + * empty_zero_page is a special page that is used for
> + * zero-initialized data and COW.
> + */
> +struct page *empty_zero_page;
> +EXPORT_SYMBOL(empty_zero_page);
> +
> #ifdef CONFIG_ARM_MPU
> struct mpu_rgn_info mpu_rgn_info;
> #endif
> @@ -148,9 +155,21 @@ void __init adjust_lowmem_bounds(void)
> */
> void __init paging_init(const struct machine_desc *mdesc)
> {
> + void *zero_page;
> +
> early_trap_init((void *)vectors_base);
> mpu_setup();
> +
> + /* allocate the zero page. */
> + zero_page = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> + if (!zero_page)
> + panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
> + __func__, PAGE_SIZE, PAGE_SIZE);
> +
> bootmem_init();
> +
> + empty_zero_page = virt_to_page(zero_page);
> + flush_dcache_page(empty_zero_page);
> }
>
> /*


2022-11-04 20:41:11

by Arnd Bergmann

Subject: Re: [PATCH v2 1/2] ARM: mm: fix no-MMU ZERO_PAGE() implementation

On Fri, Nov 4, 2022, at 21:07, Giulio Benetti wrote:
> Hello Arnd, Russell, All,
>
> is this patch OK, or does it need any changes?

Looks ok to me, please add it to Russell's patch
tracker at:
https://www.arm.linux.org.uk/developer/patches/

The patch description could be improved a little ("Link:"
tag for the URL, avoiding the wikitext markup, etc), but
it's more important to actually get the bug fixed.

Arnd

2022-11-04 20:56:33

by Arnd Bergmann

Subject: Re: [PATCH v2 1/2] ARM: mm: fix no-MMU ZERO_PAGE() implementation

On Wed, Oct 19, 2022, at 00:25, Giulio Benetti wrote:
> Currently, on no-MMU SoCs (e.g. i.MXRT), ZERO_PAGE(vaddr) expands to
> ```
> virt_to_page(0)
> ```
> which in turn expands to:
> ```
> pfn_to_page(virt_to_pfn(0))
> ```
> and virt_to_pfn(0) then expands to:
> ```
> #define virt_to_pfn(0) \
> ((((unsigned long)(0) - PAGE_OFFSET) >> PAGE_SHIFT) + \
> PHYS_PFN_OFFSET)
> ```
> where PAGE_OFFSET and PHYS_PFN_OFFSET correspond to the DRAM offset
> (0x80000000) and PAGE_SHIFT is 12. This way we obtain 16MB (0x01000000)
> added to the base of DRAM (0x80000000).
> When ZERO_PAGE(0) is then used, for example in bio_add_page(), the page
> gets an address that is out of DRAM bounds.
> So instead of using a fake virtual page 0, let's allocate a dedicated
> zero_page during paging_init() and assign it to a global 'struct page *
> empty_zero_page', the same way mmu.c does. This is the same approach used
> in m68k with commit dc068f462179, as discussed here[0]. Then let's move the
> ZERO_PAGE() definition to the top of pgtable.h so it is shared between
> mmu.c and nommu.c.
>
> [0]:
> https://lore.kernel.org/linux-m68k/[email protected]/T/#m1266ceb63ad140743174d6b3070364d3c9a5179b
>
> Signed-off-by: Giulio Benetti <[email protected]>

Reviewed-by: Arnd Bergmann <[email protected]>

2022-11-04 21:27:08

by Giulio Benetti

Subject: Re: [PATCH v2 1/2] ARM: mm: fix no-MMU ZERO_PAGE() implementation

Hi Arnd,

On 04/11/22 21:28, Arnd Bergmann wrote:
> On Fri, Nov 4, 2022, at 21:07, Giulio Benetti wrote:
>> Hello Arnd, Russell, All,
>>
>> is this patch OK, or does it need any changes?
>
> Looks ok to me, please add it to Russell's patch
> tracker at:
> https://www.arm.linux.org.uk/developer/patches/

I've submitted it with your Reviewed-by:
https://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=9266/1

> The patch description could be improved a little ("Link:"
> tag for the URL, avoiding the wikitext markup, etc),

Oh, I was not aware of "Link:" or of avoiding wikitext markup. Next
time I won't use the markup and will use a "Link:" tag instead. I'll grep
some other commit logs to find the correct way.

> but
> it's more important to actually get the bug fixed.

Thank you for guiding me
Best regards
--
Giulio Benetti
CEO/CTO@Benetti Engineering sas