2020-06-30 13:54:54

by Zenghui Yu

[permalink] [raw]
Subject: [PATCH] irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table()

Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled
box, I get the following kernel splat:

[ 0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567
[ 0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
[ 0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23
[ 0.053770] Call trace:
[ 0.053774] dump_backtrace+0x0/0x218
[ 0.053775] show_stack+0x2c/0x38
[ 0.053777] dump_stack+0xc4/0x10c
[ 0.053779] ___might_sleep+0xfc/0x140
[ 0.053780] __might_sleep+0x58/0x90
[ 0.053782] slab_pre_alloc_hook+0x7c/0x90
[ 0.053783] kmem_cache_alloc_trace+0x60/0x2f0
[ 0.053785] its_cpu_init+0x6f4/0xe40
[ 0.053786] gic_starting_cpu+0x24/0x38
[ 0.053788] cpuhp_invoke_callback+0xa0/0x710
[ 0.053789] notify_cpu_starting+0xcc/0xd8
[ 0.053790] secondary_start_kernel+0x148/0x200

# ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40
its_cpu_init+0x6f4/0xe40:
allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818
(inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138
(inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166

It turned out that we're allocating memory using GFP_KERNEL (may sleep)
within the CPU hotplug notifier, which is indeed an atomic context. Bad
thing may happen if we're playing on a system with more than a single
CommonLPIAff group. Avoid it by turning this into an atomic allocation.

Fixes: 5e5168461c22 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation")
Signed-off-by: Zenghui Yu <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 6a5a87fc4601..b66eeca442c4 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2814,7 +2814,7 @@ static int allocate_vpe_l1_table(void)
if (val & GICR_VPROPBASER_4_1_VALID)
goto out;

- gic_data_rdist()->vpe_table_mask = kzalloc(sizeof(cpumask_t), GFP_KERNEL);
+ gic_data_rdist()->vpe_table_mask = kzalloc(sizeof(cpumask_t), GFP_ATOMIC);
if (!gic_data_rdist()->vpe_table_mask)
return -ENOMEM;

@@ -2881,7 +2881,7 @@ static int allocate_vpe_l1_table(void)

pr_debug("np = %d, npg = %lld, psz = %d, epp = %d, esz = %d\n",
np, npg, psz, epp, esz);
- page = alloc_pages(GFP_KERNEL | __GFP_ZERO, get_order(np * PAGE_SIZE));
+ page = alloc_pages(GFP_ATOMIC | __GFP_ZERO, get_order(np * PAGE_SIZE));
if (!page)
return -ENOMEM;

--
2.19.1


2020-07-27 03:52:26

by Zenghui Yu

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table()

Hi Marc,

On 2020/6/30 21:37, Zenghui Yu wrote:
> Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled
> box, I get the following kernel splat:
>
> [ 0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567
> [ 0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
> [ 0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23
> [ 0.053770] Call trace:
> [ 0.053774] dump_backtrace+0x0/0x218
> [ 0.053775] show_stack+0x2c/0x38
> [ 0.053777] dump_stack+0xc4/0x10c
> [ 0.053779] ___might_sleep+0xfc/0x140
> [ 0.053780] __might_sleep+0x58/0x90
> [ 0.053782] slab_pre_alloc_hook+0x7c/0x90
> [ 0.053783] kmem_cache_alloc_trace+0x60/0x2f0
> [ 0.053785] its_cpu_init+0x6f4/0xe40
> [ 0.053786] gic_starting_cpu+0x24/0x38
> [ 0.053788] cpuhp_invoke_callback+0xa0/0x710
> [ 0.053789] notify_cpu_starting+0xcc/0xd8
> [ 0.053790] secondary_start_kernel+0x148/0x200
>
> # ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40
> its_cpu_init+0x6f4/0xe40:
> allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818
> (inlined by) its_cpu_init_lpis at drivers/irqchip/irq-gic-v3-its.c:3138
> (inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166
>
> It turned out that we're allocating memory using GFP_KERNEL (may sleep)
> within the CPU hotplug notifier, which is indeed an atomic context. Bad
> thing may happen if we're playing on a system with more than a single
> CommonLPIAff group. Avoid it by turning this into an atomic allocation.
>
> Fixes: 5e5168461c22 ("irqchip/gic-v4.1: VPE table (aka GICR_VPROPBASER) allocation")
> Signed-off-by: Zenghui Yu <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3-its.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 6a5a87fc4601..b66eeca442c4 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -2814,7 +2814,7 @@ static int allocate_vpe_l1_table(void)
> if (val & GICR_VPROPBASER_4_1_VALID)
> goto out;
>
> - gic_data_rdist()->vpe_table_mask = kzalloc(sizeof(cpumask_t), GFP_KERNEL);
> + gic_data_rdist()->vpe_table_mask = kzalloc(sizeof(cpumask_t), GFP_ATOMIC);
> if (!gic_data_rdist()->vpe_table_mask)
> return -ENOMEM;
>
> @@ -2881,7 +2881,7 @@ static int allocate_vpe_l1_table(void)
>
> pr_debug("np = %d, npg = %lld, psz = %d, epp = %d, esz = %d\n",
> np, npg, psz, epp, esz);
> - page = alloc_pages(GFP_KERNEL | __GFP_ZERO, get_order(np * PAGE_SIZE));
> + page = alloc_pages(GFP_ATOMIC | __GFP_ZERO, get_order(np * PAGE_SIZE));
> if (!page)
> return -ENOMEM;
>

Do you mind taking this patch into v5.9? Or please let me know if you
still have any concerns on it?


Thanks,
Zenghui

2020-07-27 07:59:05

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table()

Hi Zenghui,

On 2020-07-27 04:50, Zenghui Yu wrote:
> Hi Marc,
>
> On 2020/6/30 21:37, Zenghui Yu wrote:
>> Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1
>> enabled
>> box, I get the following kernel splat:
>>
>> [ 0.053766] BUG: sleeping function called from invalid context at
>> mm/slab.h:567
>> [ 0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0,
>> pid: 0, name: swapper/1
>> [ 0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+
>> #23
>> [ 0.053770] Call trace:
>> [ 0.053774] dump_backtrace+0x0/0x218
>> [ 0.053775] show_stack+0x2c/0x38
>> [ 0.053777] dump_stack+0xc4/0x10c
>> [ 0.053779] ___might_sleep+0xfc/0x140
>> [ 0.053780] __might_sleep+0x58/0x90
>> [ 0.053782] slab_pre_alloc_hook+0x7c/0x90
>> [ 0.053783] kmem_cache_alloc_trace+0x60/0x2f0
>> [ 0.053785] its_cpu_init+0x6f4/0xe40
>> [ 0.053786] gic_starting_cpu+0x24/0x38
>> [ 0.053788] cpuhp_invoke_callback+0xa0/0x710
>> [ 0.053789] notify_cpu_starting+0xcc/0xd8
>> [ 0.053790] secondary_start_kernel+0x148/0x200
>>
>> # ./scripts/faddr2line vmlinux its_cpu_init+0x6f4/0xe40
>> its_cpu_init+0x6f4/0xe40:
>> allocate_vpe_l1_table at drivers/irqchip/irq-gic-v3-its.c:2818
>> (inlined by) its_cpu_init_lpis at
>> drivers/irqchip/irq-gic-v3-its.c:3138
>> (inlined by) its_cpu_init at drivers/irqchip/irq-gic-v3-its.c:5166
>>
>> It turned out that we're allocating memory using GFP_KERNEL (may
>> sleep)
>> within the CPU hotplug notifier, which is indeed an atomic context.
>> Bad
>> thing may happen if we're playing on a system with more than a single
>> CommonLPIAff group. Avoid it by turning this into an atomic
>> allocation.
>>
>> Fixes: 5e5168461c22 ("irqchip/gic-v4.1: VPE table (aka
>> GICR_VPROPBASER) allocation")
>> Signed-off-by: Zenghui Yu <[email protected]>
>> ---
>> drivers/irqchip/irq-gic-v3-its.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c
>> b/drivers/irqchip/irq-gic-v3-its.c
>> index 6a5a87fc4601..b66eeca442c4 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -2814,7 +2814,7 @@ static int allocate_vpe_l1_table(void)
>> if (val & GICR_VPROPBASER_4_1_VALID)
>> goto out;
>> - gic_data_rdist()->vpe_table_mask = kzalloc(sizeof(cpumask_t),
>> GFP_KERNEL);
>> + gic_data_rdist()->vpe_table_mask = kzalloc(sizeof(cpumask_t),
>> GFP_ATOMIC);
>> if (!gic_data_rdist()->vpe_table_mask)
>> return -ENOMEM;
>> @@ -2881,7 +2881,7 @@ static int allocate_vpe_l1_table(void)
>> pr_debug("np = %d, npg = %lld, psz = %d, epp = %d, esz = %d\n",
>> np, npg, psz, epp, esz);
>> - page = alloc_pages(GFP_KERNEL | __GFP_ZERO, get_order(np *
>> PAGE_SIZE));
>> + page = alloc_pages(GFP_ATOMIC | __GFP_ZERO, get_order(np *
>> PAGE_SIZE));
>> if (!page)
>> return -ENOMEM;
>>
>
> Do you mind taking this patch into v5.9? Or please let me know if you
> still have any concerns on it?

Oops, I seem to have dropped this one on the floor.
I've picked it up now.

Thanks for the heads up,

M.
--
Jazz is not dead. It just smells funny...

2020-07-27 08:15:44

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table()

On Tue, 30 Jun 2020 21:37:46 +0800, Zenghui Yu wrote:
> Booting the latest kernel with DEBUG_ATOMIC_SLEEP=y on a GICv4.1 enabled
> box, I get the following kernel splat:
>
> [ 0.053766] BUG: sleeping function called from invalid context at mm/slab.h:567
> [ 0.053767] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
> [ 0.053769] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3+ #23
> [ 0.053770] Call trace:
> [ 0.053774] dump_backtrace+0x0/0x218
> [ 0.053775] show_stack+0x2c/0x38
> [ 0.053777] dump_stack+0xc4/0x10c
> [ 0.053779] ___might_sleep+0xfc/0x140
> [ 0.053780] __might_sleep+0x58/0x90
> [ 0.053782] slab_pre_alloc_hook+0x7c/0x90
> [ 0.053783] kmem_cache_alloc_trace+0x60/0x2f0
> [ 0.053785] its_cpu_init+0x6f4/0xe40
> [ 0.053786] gic_starting_cpu+0x24/0x38
> [ 0.053788] cpuhp_invoke_callback+0xa0/0x710
> [ 0.053789] notify_cpu_starting+0xcc/0xd8
> [ 0.053790] secondary_start_kernel+0x148/0x200
>
> [...]

Applied to irq/irqchip-next, thanks!

[1/1] irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table()
commit: d1bd7e0ba533a2a6f313579ec9b504f6614c35c4

Cheers,

M.
--
Without deviation from the norm, progress is not possible.