2021-12-07 06:10:13

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH] powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE

Allthough kernel text is always mapped with BATs, we still have
inittext mapped with pages, so TLB miss handling is required
when CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE is set.

The final solution should be to set a BAT that also maps inittext
but that BAT then needs to be cleared at end of init, and it will
require more changes to be able to do it properly.

As DEBUG_PAGEALLOC or KFENCE are debugging, performance is not a big
deal so let's fix it simply for now to enable easy stable application.

Reported-by: Maxime Bizon <[email protected]>
Fixes: 035b19a15a98 ("powerpc/32s: Always map kernel text and rodata with BATs")
Cc: [email protected]
Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/head_book3s_32.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
index 68e5c0a7e99d..2e2a8211b17b 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -421,14 +421,14 @@ InstructionTLBMiss:
*/
/* Get PTE (linux-style) and check access */
mfspr r3,SPRN_IMISS
-#ifdef CONFIG_MODULES
+#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE)
lis r1, TASK_SIZE@h /* check if kernel address */
cmplw 0,r1,r3
#endif
mfspr r2, SPRN_SDR1
li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC | _PAGE_USER
rlwinm r2, r2, 28, 0xfffff000
-#ifdef CONFIG_MODULES
+#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE)
bgt- 112f
lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC
--
2.33.1


2021-12-07 10:34:38

by Maxime Bizon

[permalink] [raw]
Subject: Re: [PATCH] powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE


On Tue, 2021-12-07 at 06:10 +0000, Christophe Leroy wrote:

Hello,

With the patch applied and

CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
CONFIG_DEBUG_VM=y

I get tons of this during boot:

[ 0.000000] Dentry cache hash table entries: 262144 (order: 8, 1048576 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at arch/powerpc/mm/pgtable.c:194 set_pte_at+0x18/0x160
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0+ #442
[ 0.000000] NIP: 80015ebc LR: 80016728 CTR: 800166e4
[ 0.000000] REGS: 80751dd0 TRAP: 0700 Not tainted (5.15.0+)
[ 0.000000] MSR: 00021032 <ME,IR,DR,RI> CR: 42228882 XER: 20000000
[ 0.000000]
[ 0.000000] GPR00: 800b8dc8 80751e80 806c6300 807311d8 807a1000 8ffffe84 80751ea8 00000000
[ 0.000000] GPR08: 007a1591 00000001 007a1180 00000000 42224882 00000000 3ff9c608 3fffd79c
[ 0.000000] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 800166e4 807a2000
[ 0.000000] GPR24: 807a1fff 807311d8 807311d8 807a2000 80768804 00000000 807a1000 007a1180
[ 0.000000] NIP [80015ebc] set_pte_at+0x18/0x160
[ 0.000000] LR [80016728] set_page_attr+0x44/0xc0
[ 0.000000] Call Trace:
[ 0.000000] [80751e80] [80058570] console_unlock+0x340/0x428 (unreliable)
[ 0.000000] [80751ea0] [00000000] 0x0
[ 0.000000] [80751ec0] [800b8dc8] __apply_to_page_range+0x144/0x2a8
[ 0.000000] [80751f00] [80016918] __kernel_map_pages+0x54/0x64
[ 0.000000] [80751f10] [800cfeb0] __free_pages_ok+0x1b0/0x440
[ 0.000000] [80751f50] [805cfc8c] memblock_free_all+0x1d8/0x274
[ 0.000000] [80751f90] [805c5e0c] mem_init+0x3c/0xd0
[ 0.000000] [80751fb0] [805c0bdc] start_kernel+0x404/0x5c4
[ 0.000000] [80751ff0] [000033f0] 0x33f0
[ 0.000000] Instruction dump:
[ 0.000000] 7c630034 83e1000c 5463d97e 7c0803a6 38210010 4e800020 9421ffe0 93e1001c
[ 0.000000] 83e60000 81250000 71290001 41820014 <0fe00000> 7c0802a6 93c10018 90010024


--
Maxime




2021-12-09 06:08:55

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE

Christophe Leroy <[email protected]> writes:
> Le 07/12/2021 à 11:34, Maxime Bizon a écrit :
>>
>> On Tue, 2021-12-07 at 06:10 +0000, Christophe Leroy wrote:
>>
>> Hello,
>>
>> With the patch applied and
>>
>> CONFIG_DEBUG_PAGEALLOC=y
>> CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
>> CONFIG_DEBUG_VM=y
>>
>> I get tons of this during boot:
>>
>> [ 0.000000] Dentry cache hash table entries: 262144 (order: 8, 1048576 bytes, linear)
>> [ 0.000000] Inode-cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
>> [ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
>> [ 0.000000] ------------[ cut here ]------------
>> [ 0.000000] WARNING: CPU: 0 PID: 0 at arch/powerpc/mm/pgtable.c:194 set_pte_at+0x18/0x160
>> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0+ #442
>> [ 0.000000] NIP: 80015ebc LR: 80016728 CTR: 800166e4
>> [ 0.000000] REGS: 80751dd0 TRAP: 0700 Not tainted (5.15.0+)
>> [ 0.000000] MSR: 00021032 <ME,IR,DR,RI> CR: 42228882 XER: 20000000
>> [ 0.000000]
>> [ 0.000000] GPR00: 800b8dc8 80751e80 806c6300 807311d8 807a1000 8ffffe84 80751ea8 00000000
>> [ 0.000000] GPR08: 007a1591 00000001 007a1180 00000000 42224882 00000000 3ff9c608 3fffd79c
>> [ 0.000000] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 800166e4 807a2000
>> [ 0.000000] GPR24: 807a1fff 807311d8 807311d8 807a2000 80768804 00000000 807a1000 007a1180
>> [ 0.000000] NIP [80015ebc] set_pte_at+0x18/0x160
>> [ 0.000000] LR [80016728] set_page_attr+0x44/0xc0
>> [ 0.000000] Call Trace:
>> [ 0.000000] [80751e80] [80058570] console_unlock+0x340/0x428 (unreliable)
>> [ 0.000000] [80751ea0] [00000000] 0x0
>> [ 0.000000] [80751ec0] [800b8dc8] __apply_to_page_range+0x144/0x2a8
>> [ 0.000000] [80751f00] [80016918] __kernel_map_pages+0x54/0x64
>> [ 0.000000] [80751f10] [800cfeb0] __free_pages_ok+0x1b0/0x440
>> [ 0.000000] [80751f50] [805cfc8c] memblock_free_all+0x1d8/0x274
>> [ 0.000000] [80751f90] [805c5e0c] mem_init+0x3c/0xd0
>> [ 0.000000] [80751fb0] [805c0bdc] start_kernel+0x404/0x5c4
>> [ 0.000000] [80751ff0] [000033f0] 0x33f0
>> [ 0.000000] Instruction dump:
>> [ 0.000000] 7c630034 83e1000c 5463d97e 7c0803a6 38210010 4e800020 9421ffe0 93e1001c
>> [ 0.000000] 83e60000 81250000 71290001 41820014 <0fe00000> 7c0802a6 93c10018 90010024
>>
>>
>
> That's unrelated to this patch.
>
> The problem is linked to patch c988cfd38e48 ("powerpc/32: use
> set_memory_attr()"), which changed from using __set_pte_at() to using
> set_memory_attr() which uses set_pte_at().
>
> set_pte_at() has additional checks and shall not be used to updating an
> existing PTE.
>
> Wondering if I should just use __set_pte_at() instead like in the past,
> or do like commit 9f7853d7609d ("powerpc/mm: Fix set_memory_*() against
> concurrent accesses") and use pte_update()
>
> Michael, Aneesh, any suggestion ?

The motivation for using pte_update() in that commit is that it does the
update atomically and also handles flushing the HPTE for 64-bit Hash.

But the books/32 version of pte_update() doesn't do that. In fact
there's some HPTE handling in __set_pte_at(), but then also a comment
saying it's handling in a subsequent flush_tlb_xxx().

So that doesn't really help make a decision :)

On the other hand, could you convert those set_memory_attr() calls to
change_memory_attr() and then eventually drop the former?

cheers

2022-01-21 19:11:13

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE

Michael, ping.

Le 07/12/2021 à 07:10, Christophe Leroy a écrit :
> Allthough kernel text is always mapped with BATs, we still have
> inittext mapped with pages, so TLB miss handling is required
> when CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE is set.
>
> The final solution should be to set a BAT that also maps inittext
> but that BAT then needs to be cleared at end of init, and it will
> require more changes to be able to do it properly.
>
> As DEBUG_PAGEALLOC or KFENCE are debugging, performance is not a big
> deal so let's fix it simply for now to enable easy stable application.
>
> Reported-by: Maxime Bizon <[email protected]>
> Fixes: 035b19a15a98 ("powerpc/32s: Always map kernel text and rodata with BATs")
> Cc: [email protected]
> Signed-off-by: Christophe Leroy <[email protected]>
> ---
> arch/powerpc/kernel/head_book3s_32.S | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/head_book3s_32.S b/arch/powerpc/kernel/head_book3s_32.S
> index 68e5c0a7e99d..2e2a8211b17b 100644
> --- a/arch/powerpc/kernel/head_book3s_32.S
> +++ b/arch/powerpc/kernel/head_book3s_32.S
> @@ -421,14 +421,14 @@ InstructionTLBMiss:
> */
> /* Get PTE (linux-style) and check access */
> mfspr r3,SPRN_IMISS
> -#ifdef CONFIG_MODULES
> +#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE)
> lis r1, TASK_SIZE@h /* check if kernel address */
> cmplw 0,r1,r3
> #endif
> mfspr r2, SPRN_SDR1
> li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC | _PAGE_USER
> rlwinm r2, r2, 28, 0xfffff000
> -#ifdef CONFIG_MODULES
> +#if defined(CONFIG_MODULES) || defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE)
> bgt- 112f
> lis r2, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
> li r1,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC

2022-02-18 02:26:23

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH] powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE

On Tue, 7 Dec 2021 06:10:05 +0000, Christophe Leroy wrote:
> Allthough kernel text is always mapped with BATs, we still have
> inittext mapped with pages, so TLB miss handling is required
> when CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE is set.
>
> The final solution should be to set a BAT that also maps inittext
> but that BAT then needs to be cleared at end of init, and it will
> require more changes to be able to do it properly.
>
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE
https://git.kernel.org/powerpc/c/9bb162fa26ed76031ed0e7dbc77ccea0bf977758

cheers