2022-04-21 09:08:47

by Chen Zhongjin

[permalink] [raw]
Subject: [PATCH 5.10] fix csdlock_debug cause arm64 boot panic

csdlock_debug is an early_param that enables the csd_lock_wait
feature.

Its handler calls static_branch_enable, which triggers a panic
on arm64 with this config:
CONFIG_SPARSEMEM=y
CONFIG_SPARSEMEM_VMEMMAP=n

The log shows:
Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000000
...
Call trace:
__aarch64_insn_write+0x9c/0x18c
...
static_key_enable+0x1c/0x30
csdlock_debug+0x4c/0x78
do_early_param+0x9c/0xcc
parse_args+0x26c/0x3a8
parse_early_options+0x34/0x40
parse_early_param+0x80/0xa4
setup_arch+0x150/0x6c8
start_kernel+0x8c/0x720
...
Kernel panic - not syncing: Oops: Fatal exception

Call trace inside __aarch64_insn_write:
__nr_to_section
__pfn_to_page
phys_to_page
patch_map
__aarch64_insn_write

Here, with CONFIG_SPARSEMEM_VMEMMAP=n, __nr_to_section returns
NULL and the caller dereferences it, because mem_section is not
initialized until sparse_init, which runs after the
parse_early_param stage.

So static_branch_enable shouldn't be used inside an early_param
handler. To avoid the panic, register csdlock_debug with __setup
instead, so that it runs later in boot.
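
The ordering problem described above can be sketched in userspace C.
This is only a simplified model, not the kernel code: every model_*
name, the table sizes and the init helper are hypothetical stand-ins
for mem_section, __nr_to_section and sparse_init. It shows that a
lookup made before the table is allocated returns NULL, which is the
pointer the real caller then dereferences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes, chosen only to keep the model small. */
#define MODEL_NR_ROOTS          1UL
#define MODEL_SECTIONS_PER_ROOT 4UL

struct model_mem_section {
	unsigned long section_mem_map;
};

/* Root table; stays NULL until model_sparse_init() has run. */
struct model_mem_section **model_section_roots;

/* Models the lookup: NULL when the table or the root is missing. */
struct model_mem_section *model_nr_to_section(unsigned long nr)
{
	unsigned long root = nr / MODEL_SECTIONS_PER_ROOT;

	if (!model_section_roots || root >= MODEL_NR_ROOTS)
		return NULL;
	if (!model_section_roots[root])
		return NULL;
	return &model_section_roots[root][nr % MODEL_SECTIONS_PER_ROOT];
}

/* Models late initialization; returns 0 on success, -1 on failure. */
int model_sparse_init(void)
{
	model_section_roots = calloc(MODEL_NR_ROOTS,
				     sizeof(*model_section_roots));
	if (!model_section_roots)
		return -1;
	model_section_roots[0] = calloc(MODEL_SECTIONS_PER_ROOT,
					sizeof(**model_section_roots));
	return model_section_roots[0] ? 0 : -1;
}

/* Usage: a lookup before init yields NULL, a lookup after init does not. */
void model_lookup_demo(void)
{
	assert(model_nr_to_section(0) == NULL);
	assert(model_sparse_init() == 0);
	assert(model_nr_to_section(0) != NULL);
}
```

In the model the early lookup is harmless because the caller checks
the result; the path in the trace above reads through the returned
pointer without such a check.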

Reported-by: Chen jingwen <[email protected]>
Signed-off-by: Chen Zhongjin <[email protected]>
---
kernel/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 65a630f62363..1ce64de460d0 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -176,7 +176,7 @@ static int __init csdlock_debug(char *str)

 	return 0;
 }
-early_param("csdlock_debug", csdlock_debug);
+__setup("csdlock_debug=", csdlock_debug);

 static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
 static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
--
2.17.1


2022-04-21 18:05:29

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH 5.10] fix csdlock_debug cause arm64 boot panic

Hi--

On 4/20/22 20:39, Chen Zhongjin wrote:
> csdlock_debug is a early_param to enable csd_lock_wait
> feature.
>
> It uses static_branch_enable in early_param which triggers
> a panic on arm64 with config:
> CONFIG_SPARSEMEM=y
> CONFIG_SPARSEMEM_VMEMMAP=n
>
> The log shows:
> Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000000
> ...
> Call trace:
> __aarch64_insn_write+0x9c/0x18c
> ...
> static_key_enable+0x1c/0x30
> csdlock_debug+0x4c/0x78
> do_early_param+0x9c/0xcc
> parse_args+0x26c/0x3a8
> parse_early_options+0x34/0x40
> parse_early_param+0x80/0xa4
> setup_arch+0x150/0x6c8
> start_kernel+0x8c/0x720
> ...
> Kernel panic - not syncing: Oops: Fatal exception
>
> Call trace inside __aarch64_insn_write:
> __nr_to_section
> __pfn_to_page
> phys_to_page
> patch_map
> __aarch64_insn_write
>
> Here, with CONFIG_SPARSEMEM_VMEMMAP=n, __nr_to_section returns
> NULL and makes the NULL dereference because mem_section is
> initialized in sparse_init after parse_early_param stage.
>
> So, static_branch_enable shouldn't be used inside early_param.
> To avoid this, I changed it to __setup and fixed this.
>
> Reported-by: Chen jingwen <[email protected]>
> Signed-off-by: Chen Zhongjin <[email protected]>
> ---
> kernel/smp.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 65a630f62363..1ce64de460d0 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -176,7 +176,7 @@ static int __init csdlock_debug(char *str)
>
> return 0;

^^^ This should be
return 1;

since __setup() functions return 1 on success -- opposite of
early_param() return values.
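
The two return conventions can be contrasted with a small userspace
sketch. It is a model only: the model_* names and the two stand-in
handlers are hypothetical, and the real dispatch code does more than
compare a return value. It assumes that an early_param() handler
reports success with 0, and that a __setup() handler reports "handled"
with a non-zero value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical handler type, shaped like int (*setup_func)(char *). */
typedef int (*model_setup_fn)(char *);

/* early_param() convention: 0 means success. */
bool model_early_param_ok(model_setup_fn fn, char *val)
{
	return fn(val) == 0;
}

/* __setup() convention: non-zero means the option was handled. */
bool model_setup_handled(model_setup_fn fn, char *val)
{
	return fn(val) != 0;
}

/* Stand-in for the handler as posted: still returns 0. */
int model_csdlock_debug_v1(char *str)
{
	(void)str;
	return 0;
}

/* Stand-in for the handler with the requested change: returns 1. */
int model_csdlock_debug_v2(char *str)
{
	(void)str;
	return 1;
}

/* Usage: the same "return 0" flips meaning between the two macros. */
void model_return_demo(void)
{
	char arg[] = "1";

	assert(model_early_param_ok(model_csdlock_debug_v1, arg));
	assert(!model_setup_handled(model_csdlock_debug_v1, arg));
	assert(model_setup_handled(model_csdlock_debug_v2, arg));
}
```

With "return 0" left in place, the option would be seen as unhandled
once the handler is registered through __setup().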

> }
> -early_param("csdlock_debug", csdlock_debug);
> +__setup("csdlock_debug=", csdlock_debug);
>
> static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
> static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);

Thanks.

--
~Randy

2022-04-22 05:19:39

by Chen Zhongjin

[permalink] [raw]
Subject: Re: [PATCH 5.10] fix csdlock_debug cause arm64 boot panic

Hi,

On 2022/4/21 12:08, Randy Dunlap wrote:
> Hi--
>
> On 4/20/22 20:39, Chen Zhongjin wrote:
>> ---
>> kernel/smp.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index 65a630f62363..1ce64de460d0 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -176,7 +176,7 @@ static int __init csdlock_debug(char *str)
>>
>> return 0;
>
> ^^^ This should be
> return 1;
>
> since __setup() functions return 1 on success -- opposite of
> early_param() return values.
>

Fixed in v2.

By the way, the patch below forces CONFIG_SPARSEMEM_VMEMMAP on for
arm64 starting from 5.12-rc3. With it, __pfn_to_page no longer calls
__nr_to_section, so this bug does not occur there.

https://lore.kernel.org/all/[email protected]/

So this patch only needs to be applied to 5.10-LTS.
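
For comparison, here is a rough userspace sketch of why the
CONFIG_SPARSEMEM_VMEMMAP lookup does not depend on mem_section. It
assumes the pfn-to-page conversion is plain pointer arithmetic on a
fixed base (vmemmap + pfn). The model_* names and the small array
standing in for the vmemmap region are hypothetical, and the sketch
says nothing about when that region becomes usable during boot.

```c
#include <assert.h>
#include <stddef.h>

struct model_page {
	unsigned long flags;
};

/* Stand-in for the virtually contiguous array of struct page. */
struct model_page model_vmemmap_area[8];

/* Fixed base, known at build time, so no table lookup is needed. */
struct model_page *model_vmemmap = model_vmemmap_area;

/* Models the vmemmap-style conversion: base plus pfn, nothing else. */
struct model_page *model_pfn_to_page_vmemmap(unsigned long pfn)
{
	return model_vmemmap + pfn;
}

/* Usage: the result is a direct offset from the base. */
void model_vmemmap_demo(void)
{
	assert(model_pfn_to_page_vmemmap(0) == &model_vmemmap_area[0]);
	assert(model_pfn_to_page_vmemmap(3) == &model_vmemmap_area[3]);
}
```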

>> }
>> -early_param("csdlock_debug", csdlock_debug);
>> +__setup("csdlock_debug=", csdlock_debug);
>>
>> static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
>> static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
>
> Thanks.
>

Thanks!