2021-10-30 18:34:22

by Rongwei Wang

Subject: [PATCH 0/2] fix bug when calling kexec_load()

Patch 1/2 fixes a crash triggered by calling kexec_load().
A reproducer is shown in the commit log.

Patch 2/2 is a very simple optimization that reduces
calls to page_address() in kexec_page_alloc().

Thanks!

Rongwei Wang (2):
arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in
copy_pte()
arm64: kexec: reduce calls to page_address()

arch/arm64/kernel/machine_kexec.c | 6 ++++--
arch/arm64/mm/trans_pgd.c | 7 ++++---
2 files changed, 8 insertions(+), 5 deletions(-)

--
2.27.0


2021-10-30 18:34:55

by Rongwei Wang

Subject: [PATCH 1/2] arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in copy_pte()

Commit 5de59884ac0e ("arm64: trans_pgd: pass NULL instead
of init_mm to *_populate functions") simply replaced init_mm
with NULL for pmd_populate_kernel. Commit 59511cfd08f3
("arm64: mm: use XN table mapping attributes for user/kernel
mappings") then added a check of the mm context in
pmd_populate_kernel. Together, these changes cause a crash when
copy_pte() in trans_pgd.c is executed, as follows:

kernel BUG at arch/arm64/include/asm/pgalloc.h:79!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in: rfkill(E) aes_ce_blk(E) aes_ce_cipher(E) ...
CPU: 21 PID: 1617 Comm: a.out Kdump: loaded Tainted: ... 5.15.0-rc7-mm1+ #8
Hardware name: ECS, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : trans_pgd_create_copy+0x4ac/0x4f0
lr : trans_pgd_create_copy+0x34c/0x4f0
sp : ffff80001bf2bc50
x29: ffff80001bf2bc50 x28: ffff0010067f1000 x27: ffff800011072000
x26: ffff001fffff8000 x25: ffff008000000000 x24: 0040000000000041
x23: 0040000000000001 x22: ffff80001bf2bd68 x21: ffff80001188ded8
x20: ffff800000000000 x19: ffff000000000000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 00000000200004c0
x14: ffff00003fffffff x13: ffff007fffffffff x12: ffff800010f882a8
x11: 0000000000face57 x10: 0000000000000001 x9 : 0000000000000000
x8 : ffff00100cece000 x7 : ffff001001c9f000 x6 : ffff00100ae40000
x5 : 0000000000000040 x4 : 0000000000000000 x3 : ffff001fffff7000
x2 : ffff000000200000 x1 : ffff000040000000 x0 : ffff00100cecd000
Call trace:
trans_pgd_create_copy+0x4ac/0x4f0
machine_kexec_post_load+0x94/0x3bc
do_kexec_load+0x11c/0x2e0
__arm64_sys_kexec_load+0xa8/0xf4
invoke_syscall+0x50/0x120
el0_svc_common.constprop.0+0x58/0x190
do_el0_svc+0x2c/0x90
el0_svc+0x28/0xe0
el0t_64_sync_handler+0xb0/0xb4
el0t_64_sync+0x180/0x184
Code: f90000c0 d5033a9f d5033fdf 17ffff7b (d4210000)
---[ end trace cc5461ffe1a085db ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception

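This bug can be reproduced with the following user-space test case:

#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

void execute_kexec_load(void)
{
	/* prot 0 = PROT_NONE, 7 = RWX; flags 0x32 = MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS */
	syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);

	/* one all-zero struct kexec_segment (buf, bufsz, mem, memsz) */
	*(uint64_t*)0x200004c0 = 0;
	*(uint64_t*)0x200004c8 = 0;
	*(uint64_t*)0x200004d0 = 0;
	*(uint64_t*)0x200004d8 = 0;
	/* entry = 0, nr_segments = 1, segments = 0x200004c0, flags = 0 */
	syscall(__NR_kexec_load, 0ul, 1ul, 0x200004c0ul, 0ul);
}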

This patch makes some simple changes, including replacing
pmd_populate_kernel with pmd_populate.

Fixes: 59511cfd08f3 ("arm64: mm: use XN table mapping attributes for user/kernel mappings")
Reported-by: Abaci <[email protected]>
Signed-off-by: Rongwei Wang <[email protected]>
---
arch/arm64/mm/trans_pgd.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index d7da8ca40d2e..3f1fc6cb9c9d 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -62,12 +62,13 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
 {
 	pte_t *src_ptep;
 	pte_t *dst_ptep;
+	struct page *page;
 	unsigned long addr = start;
 
-	dst_ptep = trans_alloc(info);
-	if (!dst_ptep)
+	page = virt_to_page(trans_alloc(info));
+	if (!page)
 		return -ENOMEM;
-	pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
+	pmd_populate(NULL, dst_pmdp, page);
 	dst_ptep = pte_offset_kernel(dst_pmdp, start);
 
 	src_ptep = pte_offset_kernel(src_pmdp, start);
--
2.27.0
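
A note on the reproducer in the commit message above: the four 64-bit stores appear to fill in a single, all-zero segment descriptor at the address passed as the third argument of kexec_load(). The sketch below is a user-space illustration only. `struct segment_model`, `SEGMENTS_ADDR` and `field_addr()` are hypothetical names, and the layout assumes a 64-bit ABI where the four fields of struct kexec_segment (buf, bufsz, mem, memsz) are each 8 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct kexec_segment on a 64-bit ABI (assumption). */
struct segment_model {
	uint64_t buf;	/* user pointer to the source data */
	uint64_t bufsz;	/* size of the source data */
	uint64_t mem;	/* destination physical address */
	uint64_t memsz;	/* size of the destination region */
};

/* Address the reproducer passes as the "segments" argument. */
#define SEGMENTS_ADDR 0x200004c0ul

/* Return the user address of the field that starts at the given byte offset. */
static unsigned long field_addr(size_t offset)
{
	assert(offset < sizeof(struct segment_model));
	return SEGMENTS_ADDR + offset;
}
```

Under these assumptions, the offsets of buf, bufsz, mem and memsz map to 0x200004c0, 0x200004c8, 0x200004d0 and 0x200004d8, which are exactly the four addresses the reproducer writes zero to.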

2021-10-31 12:26:57

by Ard Biesheuvel

Subject: Re: [PATCH 1/2] arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in copy_pte()

On Sat, 30 Oct 2021 at 20:32, Rongwei Wang
<[email protected]> wrote:
>
> [...]
> ---
> arch/arm64/mm/trans_pgd.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
> index d7da8ca40d2e..3f1fc6cb9c9d 100644
> --- a/arch/arm64/mm/trans_pgd.c
> +++ b/arch/arm64/mm/trans_pgd.c
> @@ -62,12 +62,13 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
> {
> pte_t *src_ptep;
> pte_t *dst_ptep;
> + struct page *page;
> unsigned long addr = start;
>
> - dst_ptep = trans_alloc(info);
> - if (!dst_ptep)
> + page = virt_to_page(trans_alloc(info));
> + if (!page)
> return -ENOMEM;
> - pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
> + pmd_populate(NULL, dst_pmdp, page);

Are you sure this truly fixes the underlying issue rather than the symptom?

pmd_populate() will create a table entry with the PXN attribute set,
which means nothing below it will be executable by the kernel,
regardless of the executable permissions at the PTE level.
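
To illustrate the point about hierarchical attributes, here is a minimal user-space sketch, not kernel code. The bit positions follow the ARMv8 stage 1 descriptor format (PXNTable is bit 59 and UXNTable is bit 60 of a table descriptor, PXN is bit 53 of a page descriptor). The `model_*` helpers and `kernel_can_exec()` are hypothetical names that only mimic which table attribute each populate helper sets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TYPE_TABLE	UINT64_C(3)		/* valid table descriptor */
#define TABLE_PXN	(UINT64_C(1) << 59)	/* PXNTable in a table descriptor */
#define TABLE_UXN	(UINT64_C(1) << 60)	/* UXNTable in a table descriptor */
#define PTE_PXN		(UINT64_C(1) << 53)	/* PXN in a page descriptor */

/* Models pmd_populate(): the table entry carries PXNTable. */
static uint64_t model_pmd_populate(uint64_t pte_table_pa)
{
	assert((pte_table_pa & 0xfff) == 0);	/* table must be page aligned */
	return pte_table_pa | TYPE_TABLE | TABLE_PXN;
}

/* Models pmd_populate_kernel(): the table entry carries UXNTable. */
static uint64_t model_pmd_populate_kernel(uint64_t pte_table_pa)
{
	assert((pte_table_pa & 0xfff) == 0);
	return pte_table_pa | TYPE_TABLE | TABLE_UXN;
}

/* The kernel may execute a page only if no level forbids privileged execute. */
static bool kernel_can_exec(uint64_t table_desc, uint64_t pte)
{
	return !(table_desc & TABLE_PXN) && !(pte & PTE_PXN);
}
```

With a PTE that leaves PXN clear, `kernel_can_exec()` is false under `model_pmd_populate()` and true under `model_pmd_populate_kernel()`, which is why swapping the helper would silently make the copied kernel mappings non-executable.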


> dst_ptep = pte_offset_kernel(dst_pmdp, start);
>
> src_ptep = pte_offset_kernel(src_pmdp, start);
> --
> 2.27.0
>

2021-11-01 02:15:43

by Rongwei Wang

Subject: Re: [PATCH 1/2] arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in copy_pte()



On 10/31/21 8:25 PM, Ard Biesheuvel wrote:
> On Sat, 30 Oct 2021 at 20:32, Rongwei Wang
> <[email protected]> wrote:
>>
>> [...]
>> ---
>> arch/arm64/mm/trans_pgd.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
>> index d7da8ca40d2e..3f1fc6cb9c9d 100644
>> --- a/arch/arm64/mm/trans_pgd.c
>> +++ b/arch/arm64/mm/trans_pgd.c
>> @@ -62,12 +62,13 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
>> {
>> pte_t *src_ptep;
>> pte_t *dst_ptep;
>> + struct page *page;
>> unsigned long addr = start;
>>
>> - dst_ptep = trans_alloc(info);
>> - if (!dst_ptep)
>> + page = virt_to_page(trans_alloc(info));
>> + if (!page)
>> return -ENOMEM;
>> - pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
>> + pmd_populate(NULL, dst_pmdp, page);
>
> Are you sure this truly fixes the underlying issue rather than the symptom?
>
Hi Ard

I traced the BUG to the 'VM_BUG_ON(mm != &init_mm)' line shown below,
which looks like an obviously incorrect use of 'pmd_populate_kernel' in
copy_pte. These changes appear to have been introduced this year.

static inline void
pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
{
VM_BUG_ON(mm != &init_mm);
__pmd_populate(pmdp, __pa(ptep), PMD_TYPE_TABLE | PMD_TABLE_UXN);
}

I have run some test cases, and the bug no longer triggers.
If I am missing something, please let me know and I will check it
again.
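
As a rough user-space illustration of why passing NULL trips that check (VM_BUG_ON is only active when the kernel is built with CONFIG_DEBUG_VM), consider the sketch below. `struct mm_model`, `init_mm_model`, `would_bug()` and `model_populate_kernel()` are hypothetical stand-ins, and assert() plays the role of VM_BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_model { int unused; };	/* stand-in for struct mm_struct */
static struct mm_model init_mm_model;	/* stand-in for init_mm */

/* True when the condition inside VM_BUG_ON(mm != &init_mm) would hold. */
static bool would_bug(const struct mm_model *mm)
{
	return mm != &init_mm_model;
}

/* Analogue of pmd_populate_kernel(): aborts instead of hitting BUG(). */
static void model_populate_kernel(const struct mm_model *mm)
{
	assert(!would_bug(mm));
	/* the real helper would write the table descriptor here */
}
```

Calling `model_populate_kernel(&init_mm_model)` returns normally, while `would_bug(NULL)` is true, mirroring the call that copy_pte() makes with a NULL mm.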

Thanks!

> pmd_populate() will create a table entry with the PXN attribute set,
> which means nothing below it will be executable by the kernel,
> regardless of the executable permissions at the PTE level.
>
>
>> dst_ptep = pte_offset_kernel(dst_pmdp, start);
>>
>> src_ptep = pte_offset_kernel(src_pmdp, start);
>> --
>> 2.27.0
>>

2021-11-05 07:55:39

by Rongwei Wang

Subject: Re: [PATCH 1/2] arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in copy_pte()



On 10/31/21 8:25 PM, Ard Biesheuvel wrote:
> On Sat, 30 Oct 2021 at 20:32, Rongwei Wang
> <[email protected]> wrote:
>>
>> [...]
>> ---
>> arch/arm64/mm/trans_pgd.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
>> index d7da8ca40d2e..3f1fc6cb9c9d 100644
>> --- a/arch/arm64/mm/trans_pgd.c
>> +++ b/arch/arm64/mm/trans_pgd.c
>> @@ -62,12 +62,13 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
>> {
>> pte_t *src_ptep;
>> pte_t *dst_ptep;
>> + struct page *page;
>> unsigned long addr = start;
>>
>> - dst_ptep = trans_alloc(info);
>> - if (!dst_ptep)
>> + page = virt_to_page(trans_alloc(info));
>> + if (!page)
>> return -ENOMEM;
>> - pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
>> + pmd_populate(NULL, dst_pmdp, page);
>
> Are you sure this truly fixes the underlying issue rather than the symptom?
>
> pmd_populate() will create a table entry with the PXN attribute set,
> which means nothing below it will be executable by the kernel,
> regardless of the executable permissions at the PTE level.
Hi Ard

I checked this again after your reminder. Indeed, simply replacing
pmd_populate_kernel with pmd_populate, as the above patch does, is an
incorrect fix.

How about the following change to fix this bug instead? It just reverts
one modification made in 5de59884ac0e
("arm64: trans_pgd: pass NULL instead of init_mm to *_populate functions").

diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index d7da8ca40d2e..5275ca312360 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -67,7 +67,7 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
 	dst_ptep = trans_alloc(info);
 	if (!dst_ptep)
 		return -ENOMEM;
-	pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
+	pmd_populate_kernel(&init_mm, dst_pmdp, dst_ptep);
 	dst_ptep = pte_offset_kernel(dst_pmdp, start);
 
 	src_ptep = pte_offset_kernel(src_pmdp, start);

Thanks!
>
>
>> dst_ptep = pte_offset_kernel(dst_pmdp, start);
>>
>> src_ptep = pte_offset_kernel(src_pmdp, start);
>> --
>> 2.27.0
>>

2021-11-14 20:17:03

by Rongwei Wang

Subject: [PATCH v2 0/2] fix bug when calling kexec_load()

Patch 1/2 fixes a crash triggered by calling kexec_load().
A reproducer is shown in the commit log.

Patch 2/2 is a very simple optimization that reduces
calls to page_address() in kexec_page_alloc().

v1 -> v2:
- Patch "arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in
  copy_pte()": restored the use of pmd_populate_kernel.

v1 link:
https://patchwork.kernel.org/project/linux-arm-kernel/patch/[email protected]/

Rongwei Wang (2):
arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in
copy_pte()
arm64: kexec: reduce calls to page_address()

arch/arm64/kernel/machine_kexec.c | 6 ++++--
arch/arm64/mm/trans_pgd.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)

--
2.27.0


2021-11-14 20:17:16

by Rongwei Wang

Subject: [PATCH v2 1/2] arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in copy_pte()

Commit 5de59884ac0e ("arm64: trans_pgd: pass NULL instead
of init_mm to *_populate functions") simply replaced init_mm
with NULL for pmd_populate_kernel. Commit 59511cfd08f3
("arm64: mm: use XN table mapping attributes for user/kernel
mappings") then added a check of the mm context in
pmd_populate_kernel. Together, these changes cause a crash when
copy_pte() in trans_pgd.c is executed, as follows:

kernel BUG at arch/arm64/include/asm/pgalloc.h:79!
Internal error: Oops - BUG: 0 [#1] SMP
Modules linked in: rfkill(E) aes_ce_blk(E) aes_ce_cipher(E) ...
CPU: 21 PID: 1617 Comm: a.out Kdump: loaded Tainted: ... 5.15.0-rc7-mm1+ #8
Hardware name: ECS, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : trans_pgd_create_copy+0x4ac/0x4f0
lr : trans_pgd_create_copy+0x34c/0x4f0
sp : ffff80001bf2bc50
x29: ffff80001bf2bc50 x28: ffff0010067f1000 x27: ffff800011072000
x26: ffff001fffff8000 x25: ffff008000000000 x24: 0040000000000041
x23: 0040000000000001 x22: ffff80001bf2bd68 x21: ffff80001188ded8
x20: ffff800000000000 x19: ffff000000000000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 00000000200004c0
x14: ffff00003fffffff x13: ffff007fffffffff x12: ffff800010f882a8
x11: 0000000000face57 x10: 0000000000000001 x9 : 0000000000000000
x8 : ffff00100cece000 x7 : ffff001001c9f000 x6 : ffff00100ae40000
x5 : 0000000000000040 x4 : 0000000000000000 x3 : ffff001fffff7000
x2 : ffff000000200000 x1 : ffff000040000000 x0 : ffff00100cecd000
Call trace:
trans_pgd_create_copy+0x4ac/0x4f0
machine_kexec_post_load+0x94/0x3bc
do_kexec_load+0x11c/0x2e0
__arm64_sys_kexec_load+0xa8/0xf4
invoke_syscall+0x50/0x120
el0_svc_common.constprop.0+0x58/0x190
do_el0_svc+0x2c/0x90
el0_svc+0x28/0xe0
el0t_64_sync_handler+0xb0/0xb4
el0t_64_sync+0x180/0x184
Code: f90000c0 d5033a9f d5033fdf 17ffff7b (d4210000)
---[ end trace cc5461ffe1a085db ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception

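This bug can be reproduced with the following user-space test case:

#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

void execute_kexec_load(void)
{
	/* prot 0 = PROT_NONE, 7 = RWX; flags 0x32 = MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS */
	syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);

	/* one all-zero struct kexec_segment (buf, bufsz, mem, memsz) */
	*(uint64_t*)0x200004c0 = 0;
	*(uint64_t*)0x200004c8 = 0;
	*(uint64_t*)0x200004d0 = 0;
	*(uint64_t*)0x200004d8 = 0;
	/* entry = 0, nr_segments = 1, segments = 0x200004c0, flags = 0 */
	syscall(__NR_kexec_load, 0ul, 1ul, 0x200004c0ul, 0ul);
}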

This patch makes a simple change: it restores init_mm as the
mm argument to 'pmd_populate_kernel' in 'copy_pte'.

Fixes: 59511cfd08f3 ("arm64: mm: use XN table mapping attributes for user/kernel mappings")
Reported-by: Abaci <[email protected]>
Signed-off-by: Rongwei Wang <[email protected]>
---
arch/arm64/mm/trans_pgd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index d7da8ca40d2e..5275ca312360 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -67,7 +67,7 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
 	dst_ptep = trans_alloc(info);
 	if (!dst_ptep)
 		return -ENOMEM;
-	pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
+	pmd_populate_kernel(&init_mm, dst_pmdp, dst_ptep);
 	dst_ptep = pte_offset_kernel(dst_pmdp, start);
 
 	src_ptep = pte_offset_kernel(src_pmdp, start);
--
2.27.0


2021-11-14 20:17:22

by Rongwei Wang

Subject: [PATCH v2 2/2] arm64: kexec: reduce calls to page_address()

In kexec_page_alloc(), page_address() is called twice.
This patch adds a local variable so that page_address()
is called only once.

Signed-off-by: Rongwei Wang <[email protected]>
---
arch/arm64/kernel/machine_kexec.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 1038494135c8..7f2530bcd42e 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -104,13 +104,15 @@ static void *kexec_page_alloc(void *arg)
 {
 	struct kimage *kimage = (struct kimage *)arg;
 	struct page *page = kimage_alloc_control_pages(kimage, 0);
+	void *vaddr = NULL;
 
 	if (!page)
 		return NULL;
 
-	memset(page_address(page), 0, PAGE_SIZE);
+	vaddr = page_address(page);
+	memset(vaddr, 0, PAGE_SIZE);
 
-	return page_address(page);
+	return vaddr;
 }
 
 int machine_kexec_post_load(struct kimage *kimage)
--
2.27.0
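
The pattern in the patch above (look the address up once, then reuse it) can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: `struct page_model`, `model_page_address()`, `lookup_calls` and `zeroed_page_cached()` are hypothetical stand-ins for struct page, page_address() and kexec_page_alloc(), and the call counter exists only to make the saving observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

struct page_model { unsigned char data[MODEL_PAGE_SIZE]; };

static int lookup_calls;	/* number of address lookups performed */

/* Stand-in for page_address(): returns the page's virtual address. */
static void *model_page_address(struct page_model *page)
{
	lookup_calls++;
	return page->data;
}

/* Zero the page and return its address using a single lookup. */
static void *zeroed_page_cached(struct page_model *page)
{
	void *vaddr;

	if (!page)
		return NULL;

	vaddr = model_page_address(page);
	assert(vaddr != NULL);	/* the model lookup cannot fail */
	memset(vaddr, 0, MODEL_PAGE_SIZE);

	return vaddr;
}
```

In this model each successful call performs exactly one lookup instead of two; how much that matters in the kernel depends on how page_address() is implemented for the configuration in use.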


2021-11-19 03:04:55

by Rongwei Wang

Subject: Re: [PATCH v2 1/2] arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in copy_pte()

Hi Ard and Pingfan

I fixed this bug and sent out patch v1 on 31 Oct. It seems my
patch was overlooked, while Pingfan's patch has been reviewed.

Here is the link to Pingfan's patch:

https://lore.kernel.org/linux-arm-kernel/[email protected]/#r

It seems that I have not subscribed to linux-arm-kernel successfully.

Thanks!

On 11/15/21 4:16 AM, Rongwei Wang wrote:
> [...]
> ---
> arch/arm64/mm/trans_pgd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
> index d7da8ca40d2e..5275ca312360 100644
> --- a/arch/arm64/mm/trans_pgd.c
> +++ b/arch/arm64/mm/trans_pgd.c
> @@ -67,7 +67,7 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
> dst_ptep = trans_alloc(info);
> if (!dst_ptep)
> return -ENOMEM;
> - pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
> + pmd_populate_kernel(&init_mm, dst_pmdp, dst_ptep);
> dst_ptep = pte_offset_kernel(dst_pmdp, start);
>
> src_ptep = pte_offset_kernel(src_pmdp, start);
>

2021-11-25 17:08:08

by Rongwei Wang

Subject: [PATCH v3 0/2] simple optimizations for page_address and

Hello

Patch 1/2 was originally meant to fix the bug triggered by calling
kexec_load(), but it was overlooked, and the patch at link[1] fixed the
same bug and was applied to arm64 (for-next/fixes) first. Even so, it is
nice to unify the use of 'pmd_populate_kernel' under arm64.

Patch 2/2 is a very simple optimization that reduces
calls to page_address() in kexec_page_alloc().

v2 -> v3:
- Patch "arm64: trans_pgd: unify the use of pmd_populate_kernel":
  renamed this patch.

v1 -> v2:
- Patch "arm64: trans_pgd: fix incorrect use of pmd_populate_kernel in
  copy_pte()": restored the use of pmd_populate_kernel.

link1:
https://lore.kernel.org/linux-arm-kernel/[email protected]/

v1:
https://lore.kernel.org/linux-arm-kernel/[email protected]/
v2:
https://lore.kernel.org/linux-arm-kernel/[email protected]/

Rongwei Wang (2):
arm64: trans_pgd: unify the use of pmd_populate_kernel
arm64: kexec: reduce calls to page_address()

arch/arm64/kernel/machine_kexec.c | 6 ++++--
arch/arm64/mm/trans_pgd.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)

--
1.8.3.1


2021-11-25 17:08:10

by Rongwei Wang

Subject: [PATCH v3 1/2] arm64: trans_pgd: unify the use of pmd_populate_kernel

Commit 5de59884ac0e ("arm64: trans_pgd: pass NULL instead
of init_mm to *_populate functions") simply replaced init_mm
with NULL for pmd_populate_kernel. Commit 59511cfd08f3
("arm64: mm: use XN table mapping attributes for user/kernel
mappings") then added a check of the mm context in
'pmd_populate_kernel'. Since 'pmd_populate_kernel' is otherwise
always called with init_mm on arm64, it is nice to unify its use.

This patch makes a simple change: it restores init_mm as the
mm argument to 'pmd_populate_kernel' in 'copy_pte'.

Suggested-by: Abaci <[email protected]>
Signed-off-by: Rongwei Wang <[email protected]>
---
arch/arm64/mm/trans_pgd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index d7da8ca..5275ca3 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -67,7 +67,7 @@ static int copy_pte(struct trans_pgd_info *info, pmd_t *dst_pmdp,
 	dst_ptep = trans_alloc(info);
 	if (!dst_ptep)
 		return -ENOMEM;
-	pmd_populate_kernel(NULL, dst_pmdp, dst_ptep);
+	pmd_populate_kernel(&init_mm, dst_pmdp, dst_ptep);
 	dst_ptep = pte_offset_kernel(dst_pmdp, start);
 
 	src_ptep = pte_offset_kernel(src_pmdp, start);
--
1.8.3.1


2021-11-25 17:08:22

by Rongwei Wang

Subject: [PATCH v3 2/2] arm64: kexec: reduce calls to page_address()

In kexec_page_alloc(), page_address() is called twice.
This patch adds a local variable so that page_address()
is called only once.

Signed-off-by: Rongwei Wang <[email protected]>
---
arch/arm64/kernel/machine_kexec.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 1038494..7f2530b 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -104,13 +104,15 @@ static void *kexec_page_alloc(void *arg)
 {
 	struct kimage *kimage = (struct kimage *)arg;
 	struct page *page = kimage_alloc_control_pages(kimage, 0);
+	void *vaddr = NULL;
 
 	if (!page)
 		return NULL;
 
-	memset(page_address(page), 0, PAGE_SIZE);
+	vaddr = page_address(page);
+	memset(vaddr, 0, PAGE_SIZE);
 
-	return page_address(page);
+	return vaddr;
 }
 
 int machine_kexec_post_load(struct kimage *kimage)
--
1.8.3.1


2021-12-10 18:40:51

by Catalin Marinas

Subject: Re: (subset) [PATCH v3 0/2] simple optimizations for page_address and

On Fri, 26 Nov 2021 01:05:58 +0800, Rongwei Wang wrote:
> Patch 1/2 mainly to fix the bug when calling kexec_load() originally, but
> because of ignored and link[1] also fixed this bug and had been applied
> to arm64 (for-next/fixes) before us. Anyway, It's nice to unify the use of
> 'pmd_populate_kernel' under arm64.
>
> Patch 2/2 just make a very simple optimization, reducing
> calls to page_address() in kexec_page_alloc().
>
> [...]

Applied to arm64 (for-next/misc), thanks!

[2/2] arm64: kexec: reduce calls to page_address()
https://git.kernel.org/arm64/c/7afccde389dc

--
Catalin