2023-12-02 11:15:58

by Serge Semin

[permalink] [raw]
Subject: [PATCH v2 0/6] MIPS: mm: Fix some memory-related issues

Just recently I've rebased my MIPS32-related work from kernel 6.5-rc4 onto
the latest kernel 6.7-rc1 and immediately got into a bootup-time
mm-related bug (see patches 3-4 in this series). After fixing it I decided
it was time to submit for review the generic MIPS code fixes which I have
been collecting in my local repo for the last year. I was going to submit
them a bit later after I finished working on a patchset with my SoC
arch-specific changes, but since it was getting bigger and bigger, it
turned to be reasonable to spill out the generic part of series right away
especially seeing it might get to be useful in the most recent kernel.

So this series starts with the MIPS-specific dmi_early_remap()
implementation fix. It is utilized by the DMI driver in the framework of
the dmi_setup() method, which is called at the very early boot stage - in
setup_arch((). No VM and slab available at that stage which is required
for the ioremap_cache() to properly work. Thus it was a mistake to have
the dmi_early_remap() macro-function defined as ioremap_cache(). It should
have been ioremap() in first place.

After that goes a fix for the high-memory zone PFNs calculation procedure
on MIPS. It turned out that after some not that recent commit the
IO-memory or just non-memory PFNs got to the high-memory even though they
were directly reachable, thus should have been left in the normal zone.

Then a series of fixes for the recently discovered mm-bug is presented.
Any attempt to re-map the IO-memory with the cached attribute caused the
bootup procedure to crash with the "Unhandled kernel unaligned access"
message. After some digging I found out that the problem was in the
uninitialized IO-memory pages. Please see the patch "mips: Fix max_mapnr
being uninitialized on early stages" description for the detailed
explanation of the problem and suggested fix.

After that goes a patch which adds the slab availability check into the
ioremap_prot() method. Indeed VM mapping performs the slab allocation in
the framework of the get_vm_area() method. Thus any other than uncached
IO-remappings must be proceeded only at the stages when slab is available.
A similar fix was just recently added to the generic version of
ioremap_prot() in the framework of commit a5f616483110 ("mm/ioremap: add
slab availability checking in ioremap_prot").

The patchset is closed with a small improvement which sets the MIPS
board/machine name to the dump-stack module in order to print
arch-personalized oopses in the same way as it's done on ARM, ARM64,
RISC-V, etc.

That's it for today.) Thanks for review in advance. Any tests are very
welcome.

Link: https://lore.kernel.org/linux-mips/[email protected]/
Changelog v2:
- Drop the patches:
[PATCH 5/7] mm/mm_init.c: Extend init unavailable range doc info
[PATCH 6/7] mm/mm_init.c: Append '\n' to the unavailable ranges log-message
since they have been picked up by Andrew. (@Andrew)
- Replace ioremap_uc() with using ioremap() due to having the former one
deprecated. (@Arnd)
- Add a new patch:
[PATCH v2 5/6] mips: mm: add slab availability checking in ioremap_prot
picked up from the generic ioremap_prot() implementation.
- Extend early DMI mem remapping patch log with a note regarding the unsynched
caches concern. (@Jiaxun)

Signed-off-by: Serge Semin <[email protected]>
Cc: Alexey Malahov <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Aleksandar Rikalo <[email protected]>
Cc: Aleksandar Rikalo <[email protected]>
Cc: Dragan Mladjenovic <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Chao-ying Fu <[email protected]>
Cc: Yinglu Yang <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Serge Semin (6):
mips: dmi: Fix early remap on MIPS32
mips: Fix incorrect max_low_pfn adjustment
mips: Fix max_mapnr being uninitialized on early stages
mips: Optimize max_mapnr init procedure
mips: mm: add slab availability checking in ioremap_prot
mips: Set dump-stack arch description

arch/mips/include/asm/dmi.h | 2 +-
arch/mips/kernel/prom.c | 2 ++
arch/mips/kernel/setup.c | 4 ++--
arch/mips/mm/init.c | 16 +++++++++-------
arch/mips/mm/ioremap.c | 4 ++++
5 files changed, 18 insertions(+), 10 deletions(-)

--
2.42.1


2023-12-02 11:15:58

by Serge Semin

[permalink] [raw]
Subject: [PATCH v2 6/6] mips: Set dump-stack arch description

In the framework of the MIPS architecture the mips_set_machine_name()
method is defined to set the machine name. The name currently is only used
in the /proc/cpuinfo file content generation. Let's have it utilized to
mach-personalize the dump-stack data too in a way it's done on ARM, ARM64,
RISC-V, etc.

Signed-off-by: Serge Semin <[email protected]>
---
arch/mips/kernel/prom.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/mips/kernel/prom.c b/arch/mips/kernel/prom.c
index f88ce78e13e3..6062e6fa589a 100644
--- a/arch/mips/kernel/prom.c
+++ b/arch/mips/kernel/prom.c
@@ -28,6 +28,8 @@ __init void mips_set_machine_name(const char *name)

strscpy(mips_machine_name, name, sizeof(mips_machine_name));
pr_info("MIPS: machine is %s\n", mips_get_machine_name());
+
+ dump_stack_set_arch_desc(name);
}

char *mips_get_machine_name(void)
--
2.42.1

2023-12-02 11:16:00

by Serge Semin

[permalink] [raw]
Subject: [PATCH v2 1/6] mips: dmi: Fix early remap on MIPS32

dmi_early_remap() has been defined as ioremap_cache() which on MIPS32 gets
to be converted to the VM-based mapping. DMI early remapping is performed
at the setup_arch() stage with no VM available. So calling the
dmi_early_remap() for MIPS32 causes the system to crash at the early boot
time. Fix that by converting dmi_early_remap() to the uncached remapping
which is always available on both 32 and 64-bits MIPS systems.

Note this change shall not cause any regressions on the current DMI
support implementation because on the early boot-up stage neither MIPS32
nor MIPS64 has the cacheable ioremapping support anyway.

Fixes: be8fa1cb444c ("MIPS: Add support for Desktop Management Interface (DMI)")
Signed-off-by: Serge Semin <[email protected]>

---

Note even though this patch is fully correct from the current
ioremap()-based semantics point of view and shall fix the denoted problem,
Jiaxun thinks that it's better to provide a different fix since
dmi_early_remap() doesn't work correctly on even Loongson64 - the only
currently DMI-equipped platform. In v1 discussion he promised to provide a
better fix for the problem. Until then let's consider this patch as the
only currently available solution.

Changelog v2:
- Replace ioremap_uc() with using ioremap() due to having the former one
deprecated. (@Arnd)
- Extend patch log with a note regarding the unsynched caches concern.
(@Jiaxun)
---
arch/mips/include/asm/dmi.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/include/asm/dmi.h b/arch/mips/include/asm/dmi.h
index 27415a288adf..dc397f630c66 100644
--- a/arch/mips/include/asm/dmi.h
+++ b/arch/mips/include/asm/dmi.h
@@ -5,7 +5,7 @@
#include <linux/io.h>
#include <linux/memblock.h>

-#define dmi_early_remap(x, l) ioremap_cache(x, l)
+#define dmi_early_remap(x, l) ioremap(x, l)
#define dmi_early_unmap(x, l) iounmap(x)
#define dmi_remap(x, l) ioremap_cache(x, l)
#define dmi_unmap(x) iounmap(x)
--
2.42.1

2023-12-02 11:16:00

by Serge Semin

[permalink] [raw]
Subject: [PATCH v2 2/6] mips: Fix incorrect max_low_pfn adjustment

max_low_pfn variable is incorrectly adjusted if the kernel is built with
high memory support and the later is detected in a running system, so the
memory which actually can be directly mapped is getting into the highmem
zone. See the ZONE_NORMAL range on my MIPS32r5 system:

> Zone ranges:
> DMA [mem 0x0000000000000000-0x0000000000ffffff]
> Normal [mem 0x0000000001000000-0x0000000007ffffff]
> HighMem [mem 0x0000000008000000-0x000000020fffffff]

while the zones are supposed to look as follows:

> Zone ranges:
> DMA [mem 0x0000000000000000-0x0000000000ffffff]
> Normal [mem 0x0000000001000000-0x000000001fffffff]
> HighMem [mem 0x0000000020000000-0x000000020fffffff]

Even though the physical memory within the range [0x08000000;0x20000000]
belongs to MMIO on our system, we don't really want it to be considered as
high memory since on MIPS32 that range still can be directly mapped.

Note there might be other problems caused by the max_low_pfn variable
misconfiguration. For instance high_memory variable is initialize with
virtual address corresponding to the max_low_pfn PFN, and by design it
must define the upper bound on direct map memory, then end of the normal
zone. That in its turn potentially may cause problems in accessing the
memory by means of the /dev/mem and /dev/kmem devices.

Let's fix the discovered misconfiguration then. It turns out the commit
a94e4f24ec83 ("MIPS: init: Drop boot_mem_map") didn't introduce the
max_low_pfn adjustment quite correct. If the kernel is built with high
memory support and the system is equipped with high memory, the
max_low_pfn variable will need to be initialized with PFN of the most
upper directly reachable memory address so the zone normal would be
correctly setup. On MIPS that PFN corresponds to PFN_DOWN(HIGHMEM_START).
If the system is built with no high memory support and one is detected in
the running system, we'll just need to adjust the max_pfn variable to
discard the found high memory from the system and leave the max_low_pfn as
is, since the later will be less than PFN_DOWN(HIGHMEM_START) anyway by
design of the for_each_memblock() loop performed a bit early in the
bootmem_init() method.

Fixes: a94e4f24ec83 ("MIPS: init: Drop boot_mem_map")
Signed-off-by: Serge Semin <[email protected]>
---
arch/mips/kernel/setup.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 2d2ca024bd47..0461ab49e8f1 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -321,11 +321,11 @@ static void __init bootmem_init(void)
panic("Incorrect memory mapping !!!");

if (max_pfn > PFN_DOWN(HIGHMEM_START)) {
+ max_low_pfn = PFN_DOWN(HIGHMEM_START);
#ifdef CONFIG_HIGHMEM
- highstart_pfn = PFN_DOWN(HIGHMEM_START);
+ highstart_pfn = max_low_pfn;
highend_pfn = max_pfn;
#else
- max_low_pfn = PFN_DOWN(HIGHMEM_START);
max_pfn = max_low_pfn;
#endif
}
--
2.42.1

2023-12-02 11:16:02

by Serge Semin

[permalink] [raw]
Subject: [PATCH v2 5/6] mips: mm: add slab availability checking in ioremap_prot

Recent commit a5f616483110 ("mm/ioremap: add slab availability checking in
ioremap_prot") added the slab availability check to the generic
ioremap_prot() implementation. It is reasonable to be done for the
MIPS32-specific method too since it also relies on the
get_vm_area_caller() function (by means of get_vm_area()) which requires
the slab allocator being up and running before being called.

Signed-off-by: Serge Semin <[email protected]>
---
arch/mips/mm/ioremap.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/arch/mips/mm/ioremap.c b/arch/mips/mm/ioremap.c
index b6dad2fd5575..d8243d61ef32 100644
--- a/arch/mips/mm/ioremap.c
+++ b/arch/mips/mm/ioremap.c
@@ -72,6 +72,10 @@ void __iomem *ioremap_prot(phys_addr_t phys_addr, unsigned long size,
flags == _CACHE_UNCACHED)
return (void __iomem *) CKSEG1ADDR(phys_addr);

+ /* Early remaps should use the unmapped regions til' VM is available */
+ if (WARN_ON_ONCE(!slab_is_available()))
+ return NULL;
+
/*
* Don't allow anybody to remap RAM that may be allocated by the page
* allocator, since that could lead to races & data clobbering.
--
2.42.1

2023-12-02 11:16:14

by Serge Semin

[permalink] [raw]
Subject: [PATCH v2 3/6] mips: Fix max_mapnr being uninitialized on early stages

max_mapnr variable is utilized in the pfn_valid() method in order to
determine the upper PFN space boundary. Having it uninitialized
effectively makes any PFN passed to that method invalid. That in its turn
causes the kernel mm-subsystem occasion malfunctions even after the
max_mapnr variable is actually properly updated. For instance,
pfn_valid() is called in the init_unavailable_range() method in the
framework of the calls-chain on MIPS:
setup_arch()
+-> paging_init()
+-> free_area_init()
+-> memmap_init()
+-> memmap_init_zone_range()
+-> init_unavailable_range()

Since pfn_valid() always returns "false" value before max_mapnr is
initialized in the mem_init() method, any flatmem page-holes will be left
in the poisoned/uninitialized state including the IO-memory pages. Thus
any further attempts to map/remap the IO-memory by using MMU may fail.
In particular it happened in my case on attempt to map the SRAM region.
The kernel bootup procedure just crashed on the unhandled unaligned access
bug raised in the __update_cache() method:

> Unhandled kernel unaligned access[#1]:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc1-XXX-dirty #2056
> ...
> Call Trace:
> [<8011ef9c>] __update_cache+0x88/0x1bc
> [<80385944>] ioremap_page_range+0x110/0x2a4
> [<80126948>] ioremap_prot+0x17c/0x1f4
> [<80711b80>] __devm_ioremap+0x8c/0x120
> [<80711e0c>] __devm_ioremap_resource+0xf4/0x218
> [<808bf244>] sram_probe+0x4f4/0x930
> [<80889d20>] platform_probe+0x68/0xec
> ...

Let's fix the problem by initializing the max_mapnr variable as soon as
the required data is available. In particular it can be done right in the
paging_init() method before free_area_init() is called since all the PFN
zone boundaries have already been calculated by that time.

Cc: [email protected]
Signed-off-by: Serge Semin <[email protected]>

---

Note I don't really know since what point that problem actually exists.
Based on the commits log it might had been persistent even before the
boot_mem_map allocator was dropped. On the other hand I hadn't seen it
actually come out before moving my working tree from kernel 6.5-rc4 to
6.7-rc1. So after updating the kernel I got the unhandled unaligned access
BUG() due to the access to compound head pointer the __update_cache()
method (see the commit log). After enabling the DEBUG_VM config I managed
to find out that the IO-memory pages were just left uninitialized and
poisoned:

> page:81367080 is uninitialized and poisoned (pfn 8192)
> page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> Kernel bug detected[#1]:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc1-00298-g88721b1a9ad5-dirty
> $ 0 : 00000000 812d0000 00000034 dced7cdf
> $ 4 : dced7cdf 00594000 10000000 ffff00fe
> $ 8 : 8196bfe0 00000000 00000001 818458c0
> $12 : 00000000 00000000 00000000 00000216
> $16 : 00002800 81227b80 00000000 00000000
> $20 : 00000000 00000000 00000000 00000000
> $24 : 0000022b 818458c0
> $28 : 81968000 8196be68 00000000 803a0920
> Hi : 00000000
> Lo : 00000000
> epc : 8039d2a4 BUG+0x0/0x4
> ra : 803a0920 post_alloc_hook+0x0/0x128
> Status: 10000003 KERNEL EXL IE
> Cause : 00800424 (ExcCode 09)
> PrId : 0001a830 (MIPS P5600)
> Modules linked in:
> Process swapper/0 (pid: 1, threadinfo=81968000, task=819a0000, tls=00000000)
> Stack : 00000000 8101ccb0 00000000 8196bd00 00000000 80359768 818a8300 00000001
> 81139088 8114438c 8042e4f8 81297a2c 81297a2c 81255e90 819a1b50 dced7cdf
> 81297a2c 81297a2c 00000000 81227b80 00000000 81241168 811394b0 00000000
> 81140000 80e2cee0 00000000 00000000 00000000 00000000 00000000 819b0000
> 81140000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> ...
> Call Trace:
> [<8039d2a4>] BUG+0x0/0x4
> [<803a0920>] post_alloc_hook+0x0/0x128
>
> Code: 01001025 03e00008 24020001 <000c000d> 2403003c 27bdffd0 afb2001c 3c12812f 8e4269e4

Which in its turn made me digging deeper into the way the MMIO-space pages
are initialized. That's how I got into the pfn_valid() and
init_unavailable_range() working improperly on my setup.

Anyway none of the problems above I spotted on kernel 6.5-rc4. So what
actually triggered having them finally popped up isn't that easy to be
foundn seeing the involved code hasn't changed much.
---
arch/mips/mm/init.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 5dcb525a8995..6e368a4658b5 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -422,7 +422,12 @@ void __init paging_init(void)
(highend_pfn - max_low_pfn) << (PAGE_SHIFT - 10));
max_zone_pfns[ZONE_HIGHMEM] = max_low_pfn;
}
+
+ max_mapnr = highend_pfn ? highend_pfn : max_low_pfn;
+#else
+ max_mapnr = max_low_pfn;
#endif
+ high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT);

free_area_init(max_zone_pfns);
}
@@ -458,13 +463,6 @@ void __init mem_init(void)
*/
BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT) && (PFN_PTE_SHIFT > PAGE_SHIFT));

-#ifdef CONFIG_HIGHMEM
- max_mapnr = highend_pfn ? highend_pfn : max_low_pfn;
-#else
- max_mapnr = max_low_pfn;
-#endif
- high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT);
-
maar_init();
memblock_free_all();
setup_zero_pages(); /* Setup zeroed pages. */
--
2.42.1

2023-12-21 14:56:54

by Thomas Bogendoerfer

[permalink] [raw]
Subject: Re: [PATCH v2 0/6] MIPS: mm: Fix some memory-related issues

On Sat, Dec 02, 2023 at 02:14:17PM +0300, Serge Semin wrote:
> Just recently I've rebased my MIPS32-related work from kernel 6.5-rc4 onto
> the latest kernel 6.7-rc1 and immediately got into a bootup-time
> mm-related bug (see patches 3-4 in this series). After fixing it I decided
> it was time to submit for review the generic MIPS code fixes which I have
> been collecting in my local repo for the last year. I was going to submit
> them a bit later after I finished working on a patchset with my SoC
> arch-specific changes, but since it was getting bigger and bigger, it
> turned to be reasonable to spill out the generic part of series right away
> especially seeing it might get to be useful in the most recent kernel.
>
> So this series starts with the MIPS-specific dmi_early_remap()
> implementation fix. It is utilized by the DMI driver in the framework of
> the dmi_setup() method, which is called at the very early boot stage - in
> setup_arch((). No VM and slab available at that stage which is required
> for the ioremap_cache() to properly work. Thus it was a mistake to have
> the dmi_early_remap() macro-function defined as ioremap_cache(). It should
> have been ioremap() in first place.
>
> After that goes a fix for the high-memory zone PFNs calculation procedure
> on MIPS. It turned out that after some not that recent commit the
> IO-memory or just non-memory PFNs got to the high-memory even though they
> were directly reachable, thus should have been left in the normal zone.
>
> Then a series of fixes for the recently discovered mm-bug is presented.
> Any attempt to re-map the IO-memory with the cached attribute caused the
> bootup procedure to crash with the "Unhandled kernel unaligned access"
> message. After some digging I found out that the problem was in the
> uninitialized IO-memory pages. Please see the patch "mips: Fix max_mapnr
> being uninitialized on early stages" description for the detailed
> explanation of the problem and suggested fix.
>
> After that goes a patch which adds the slab availability check into the
> ioremap_prot() method. Indeed VM mapping performs the slab allocation in
> the framework of the get_vm_area() method. Thus any other than uncached
> IO-remappings must be proceeded only at the stages when slab is available.
> A similar fix was just recently added to the generic version of
> ioremap_prot() in the framework of commit a5f616483110 ("mm/ioremap: add
> slab availability checking in ioremap_prot").
>
> The patchset is closed with a small improvement which sets the MIPS
> board/machine name to the dump-stack module in order to print
> arch-personalized oopses in the same way as it's done on ARM, ARM64,
> RISC-V, etc.
>
> That's it for today.) Thanks for review in advance. Any tests are very
> welcome.
>
> Link: https://lore.kernel.org/linux-mips/[email protected]/
> Changelog v2:
> - Drop the patches:
> [PATCH 5/7] mm/mm_init.c: Extend init unavailable range doc info
> [PATCH 6/7] mm/mm_init.c: Append '\n' to the unavailable ranges log-message
> since they have been picked up by Andrew. (@Andrew)
> - Replace ioremap_uc() with using ioremap() due to having the former one
> deprecated. (@Arnd)
> - Add a new patch:
> [PATCH v2 5/6] mips: mm: add slab availability checking in ioremap_prot
> picked up from the generic ioremap_prot() implementation.
> - Extend early DMI mem remapping patch log with a note regarding the unsynched
> caches concern. (@Jiaxun)
>
> Signed-off-by: Serge Semin <[email protected]>
> Cc: Alexey Malahov <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Aleksandar Rikalo <[email protected]>
> Cc: Aleksandar Rikalo <[email protected]>
> Cc: Dragan Mladjenovic <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Chao-ying Fu <[email protected]>
> Cc: Yinglu Yang <[email protected]>
> Cc: Tiezhu Yang <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: Matthew Wilcox (Oracle) <[email protected]>
> Cc: Marc Zyngier <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
>
> Serge Semin (6):
> mips: dmi: Fix early remap on MIPS32
> mips: Fix incorrect max_low_pfn adjustment
> mips: Fix max_mapnr being uninitialized on early stages
> mips: Optimize max_mapnr init procedure
> mips: mm: add slab availability checking in ioremap_prot
> mips: Set dump-stack arch description
>
> arch/mips/include/asm/dmi.h | 2 +-
> arch/mips/kernel/prom.c | 2 ++
> arch/mips/kernel/setup.c | 4 ++--
> arch/mips/mm/init.c | 16 +++++++++-------
> arch/mips/mm/ioremap.c | 4 ++++
> 5 files changed, 18 insertions(+), 10 deletions(-)

series applied to mips-next.

Thomas.

--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]