2021-11-01 10:34:19

by Jun Miao

[permalink] [raw]
Subject: [PATCH] rcu: avoid alloc_pages() when recording stack

The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).

Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
in case it runs out by using kasan_record_aux_stack_noalloc().

Jianwei Hu reported:
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
Call Trace:
show_stack+0x52/0x58
dump_stack+0xa1/0xd6
___might_sleep.cold+0x11c/0x12d
rt_spin_lock+0x3f/0xc0
rmqueue+0x100/0x1460
rmqueue+0x100/0x1460
mark_usage+0x1a0/0x1a0
ftrace_graph_ret_addr+0x2a/0xb0
rmqueue_pcplist.constprop.0+0x6a0/0x6a0
__kasan_check_read+0x11/0x20
__zone_watermark_ok+0x114/0x270
get_page_from_freelist+0x148/0x630
is_module_text_address+0x32/0xa0
__alloc_pages_nodemask+0x2f6/0x790
__alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
create_prof_cpu_mask+0x30/0x30
alloc_pages_current+0xb1/0x150
stack_depot_save+0x39f/0x490
kasan_save_stack+0x42/0x50
kasan_save_stack+0x23/0x50
kasan_record_aux_stack+0xa9/0xc0
__call_rcu+0xff/0x9c0
call_rcu+0xe/0x10
put_object+0x53/0x70
__delete_object+0x7b/0x90
kmemleak_free+0x46/0x70
slab_free_freelist_hook+0xb4/0x160
kfree+0xe5/0x420
kfree_const+0x17/0x30
kobject_cleanup+0xaa/0x230
kobject_put+0x76/0x90
netdev_queue_update_kobjects+0x17d/0x1f0
... ...
ksys_write+0xd9/0x180
__x64_sys_write+0x42/0x50
do_syscall_64+0x38/0x50
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
Reported-by: Jianwei Hu <[email protected]>
Signed-off-by: Jun Miao <[email protected]>
---
kernel/rcu/tree.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8270e58cd0f3..2c1034580f15 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
head->func = func;
head->next = NULL;
local_irq_save(flags);
- kasan_record_aux_stack(head);
+ kasan_record_aux_stack_noalloc(head);
rdp = this_cpu_ptr(&rcu_data);

/* Add the callback to our list. */
@@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
return;
}

- kasan_record_aux_stack(ptr);
+ kasan_record_aux_stack_noalloc(ptr);
success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
if (!success) {
run_page_cache_worker(krcp);
--
2.33.0


2021-11-02 04:24:59

by Jun Miao

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

Add KASAN maintainers

On 11/1/21 6:31 PM, Jun Miao wrote:
> The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
> which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
> it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
> non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
>
> Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
> in case it runs out by using kasan_record_aux_stack_noalloc().
>
> Jianwei Hu reported:
> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
> INFO: lockdep is turned off.
> irq event stamp: 0
> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> softirqs last disabled at (0): [<0000000000000000>] 0x0
> CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
> Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
> Call Trace:
> show_stack+0x52/0x58
> dump_stack+0xa1/0xd6
> ___might_sleep.cold+0x11c/0x12d
> rt_spin_lock+0x3f/0xc0
> rmqueue+0x100/0x1460
> rmqueue+0x100/0x1460
> mark_usage+0x1a0/0x1a0
> ftrace_graph_ret_addr+0x2a/0xb0
> rmqueue_pcplist.constprop.0+0x6a0/0x6a0
> __kasan_check_read+0x11/0x20
> __zone_watermark_ok+0x114/0x270
> get_page_from_freelist+0x148/0x630
> is_module_text_address+0x32/0xa0
> __alloc_pages_nodemask+0x2f6/0x790
> __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
> create_prof_cpu_mask+0x30/0x30
> alloc_pages_current+0xb1/0x150
> stack_depot_save+0x39f/0x490
> kasan_save_stack+0x42/0x50
> kasan_save_stack+0x23/0x50
> kasan_record_aux_stack+0xa9/0xc0
> __call_rcu+0xff/0x9c0
> call_rcu+0xe/0x10
> put_object+0x53/0x70
> __delete_object+0x7b/0x90
> kmemleak_free+0x46/0x70
> slab_free_freelist_hook+0xb4/0x160
> kfree+0xe5/0x420
> kfree_const+0x17/0x30
> kobject_cleanup+0xaa/0x230
> kobject_put+0x76/0x90
> netdev_queue_update_kobjects+0x17d/0x1f0
> ... ...
> ksys_write+0xd9/0x180
> __x64_sys_write+0x42/0x50
> do_syscall_64+0x38/0x50
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
> Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
> Reported-by: Jianwei Hu <[email protected]>
> Signed-off-by: Jun Miao <[email protected]>
> ---
> kernel/rcu/tree.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8270e58cd0f3..2c1034580f15 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> head->func = func;
> head->next = NULL;
> local_irq_save(flags);
> - kasan_record_aux_stack(head);
> + kasan_record_aux_stack_noalloc(head);
> rdp = this_cpu_ptr(&rcu_data);
>
> /* Add the callback to our list. */
> @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> return;
> }
>
> - kasan_record_aux_stack(ptr);
> + kasan_record_aux_stack_noalloc(ptr);
> success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
> if (!success) {
> run_page_cache_worker(krcp);

2021-11-02 14:55:46

by Uladzislau Rezki

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

>
> Add KASAN maintainers
>
> On 11/1/21 6:31 PM, Jun Miao wrote:
> > The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
> > which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
> > it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
> > non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
> >
> > Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
> > in case it runs out by using kasan_record_aux_stack_noalloc().
> >
> > Jianwei Hu reported:
> > BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
> > in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
> > INFO: lockdep is turned off.
> > irq event stamp: 0
> > hardirqs last enabled at (0): [<0000000000000000>] 0x0
> > hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > softirqs last disabled at (0): [<0000000000000000>] 0x0
> > CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
> > Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
> > Call Trace:
> > show_stack+0x52/0x58
> > dump_stack+0xa1/0xd6
> > ___might_sleep.cold+0x11c/0x12d
> > rt_spin_lock+0x3f/0xc0
> > rmqueue+0x100/0x1460
> > rmqueue+0x100/0x1460
> > mark_usage+0x1a0/0x1a0
> > ftrace_graph_ret_addr+0x2a/0xb0
> > rmqueue_pcplist.constprop.0+0x6a0/0x6a0
> > __kasan_check_read+0x11/0x20
> > __zone_watermark_ok+0x114/0x270
> > get_page_from_freelist+0x148/0x630
> > is_module_text_address+0x32/0xa0
> > __alloc_pages_nodemask+0x2f6/0x790
> > __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
> > create_prof_cpu_mask+0x30/0x30
> > alloc_pages_current+0xb1/0x150
> > stack_depot_save+0x39f/0x490
> > kasan_save_stack+0x42/0x50
> > kasan_save_stack+0x23/0x50
> > kasan_record_aux_stack+0xa9/0xc0
> > __call_rcu+0xff/0x9c0
> > call_rcu+0xe/0x10
> > put_object+0x53/0x70
> > __delete_object+0x7b/0x90
> > kmemleak_free+0x46/0x70
> > slab_free_freelist_hook+0xb4/0x160
> > kfree+0xe5/0x420
> > kfree_const+0x17/0x30
> > kobject_cleanup+0xaa/0x230
> > kobject_put+0x76/0x90
> > netdev_queue_update_kobjects+0x17d/0x1f0
> > ... ...
> > ksys_write+0xd9/0x180
> > __x64_sys_write+0x42/0x50
> > do_syscall_64+0x38/0x50
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
> > Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
> > Reported-by: Jianwei Hu <[email protected]>
> > Signed-off-by: Jun Miao <[email protected]>
> > ---
> > kernel/rcu/tree.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 8270e58cd0f3..2c1034580f15 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> > head->func = func;
> > head->next = NULL;
> > local_irq_save(flags);
> > - kasan_record_aux_stack(head);
> > + kasan_record_aux_stack_noalloc(head);
> > rdp = this_cpu_ptr(&rcu_data);
> >
> > /* Add the callback to our list. */
> > @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > return;
> > }
> >
> > - kasan_record_aux_stack(ptr);
> > + kasan_record_aux_stack_noalloc(ptr);
> > success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
> > if (!success) {
> > run_page_cache_worker(krcp);
>
Yep an allocation is tricky here. This change looks correct to me at
least from the point that it does not allocate.

--
Uladzislau Rezki

2021-11-03 06:52:47

by Jun Miao

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack


On 11/2/21 10:53 PM, Uladzislau Rezki wrote:
> [Please note: This e-mail is from an EXTERNAL e-mail address]
>
>> Add KASAN maintainers
>>
>> On 11/1/21 6:31 PM, Jun Miao wrote:
>>> The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
>>> which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
>>> it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
>>> non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
>>>
>>> Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
>>> in case it runs out by using kasan_record_aux_stack_noalloc().
>>>
>>> Jianwei Hu reported:
>>> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
>>> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
>>> INFO: lockdep is turned off.
>>> irq event stamp: 0
>>> hardirqs last enabled at (0): [<0000000000000000>] 0x0
>>> hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
>>> softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
>>> softirqs last disabled at (0): [<0000000000000000>] 0x0
>>> CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
>>> Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
>>> Call Trace:
>>> show_stack+0x52/0x58
>>> dump_stack+0xa1/0xd6
>>> ___might_sleep.cold+0x11c/0x12d
>>> rt_spin_lock+0x3f/0xc0
>>> rmqueue+0x100/0x1460
>>> rmqueue+0x100/0x1460
>>> mark_usage+0x1a0/0x1a0
>>> ftrace_graph_ret_addr+0x2a/0xb0
>>> rmqueue_pcplist.constprop.0+0x6a0/0x6a0
>>> __kasan_check_read+0x11/0x20
>>> __zone_watermark_ok+0x114/0x270
>>> get_page_from_freelist+0x148/0x630
>>> is_module_text_address+0x32/0xa0
>>> __alloc_pages_nodemask+0x2f6/0x790
>>> __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
>>> create_prof_cpu_mask+0x30/0x30
>>> alloc_pages_current+0xb1/0x150
>>> stack_depot_save+0x39f/0x490
>>> kasan_save_stack+0x42/0x50
>>> kasan_save_stack+0x23/0x50
>>> kasan_record_aux_stack+0xa9/0xc0
>>> __call_rcu+0xff/0x9c0
>>> call_rcu+0xe/0x10
>>> put_object+0x53/0x70
>>> __delete_object+0x7b/0x90
>>> kmemleak_free+0x46/0x70
>>> slab_free_freelist_hook+0xb4/0x160
>>> kfree+0xe5/0x420
>>> kfree_const+0x17/0x30
>>> kobject_cleanup+0xaa/0x230
>>> kobject_put+0x76/0x90
>>> netdev_queue_update_kobjects+0x17d/0x1f0
>>> ... ...
>>> ksys_write+0xd9/0x180
>>> __x64_sys_write+0x42/0x50
>>> do_syscall_64+0x38/0x50
>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
>>> Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
>>> Reported-by: Jianwei Hu <[email protected]>
>>> Signed-off-by: Jun Miao <[email protected]>
>>> ---
>>> kernel/rcu/tree.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>>> index 8270e58cd0f3..2c1034580f15 100644
>>> --- a/kernel/rcu/tree.c
>>> +++ b/kernel/rcu/tree.c
>>> @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
>>> head->func = func;
>>> head->next = NULL;
>>> local_irq_save(flags);
>>> - kasan_record_aux_stack(head);
>>> + kasan_record_aux_stack_noalloc(head);
>>> rdp = this_cpu_ptr(&rcu_data);
>>>
>>> /* Add the callback to our list. */
>>> @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>>> return;
>>> }
>>>
>>> - kasan_record_aux_stack(ptr);
>>> + kasan_record_aux_stack_noalloc(ptr);
>>> success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
>>> if (!success) {
>>> run_page_cache_worker(krcp);
> Yep an allocation is tricky here. This change looks correct to me at
> least from the point that it does not allocate.
>
> --
> Uladzislau Rezki

Thanks your approval. Could you like to give me a review?

And This fix patch is my last patch in WindRiver by using email of
[email protected]. Because i will go to intel next week.

What about your advice @Paul E

---
Jun Miao

2021-11-03 14:01:09

by Uladzislau Rezki

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

On Wed, Nov 3, 2021 at 7:51 AM Jun Miao <[email protected]> wrote:
>
>
> On 11/2/21 10:53 PM, Uladzislau Rezki wrote:
> > [Please note: This e-mail is from an EXTERNAL e-mail address]
> >
> >> Add KASAN maintainers
> >>
> >> On 11/1/21 6:31 PM, Jun Miao wrote:
> >>> The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
> >>> which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
> >>> it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
> >>> non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
> >>>
> >>> Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
> >>> in case it runs out by using kasan_record_aux_stack_noalloc().
> >>>
> >>> Jianwei Hu reported:
> >>> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
> >>> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
> >>> INFO: lockdep is turned off.
> >>> irq event stamp: 0
> >>> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> >>> hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> >>> softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> >>> softirqs last disabled at (0): [<0000000000000000>] 0x0
> >>> CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
> >>> Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
> >>> Call Trace:
> >>> show_stack+0x52/0x58
> >>> dump_stack+0xa1/0xd6
> >>> ___might_sleep.cold+0x11c/0x12d
> >>> rt_spin_lock+0x3f/0xc0
> >>> rmqueue+0x100/0x1460
> >>> rmqueue+0x100/0x1460
> >>> mark_usage+0x1a0/0x1a0
> >>> ftrace_graph_ret_addr+0x2a/0xb0
> >>> rmqueue_pcplist.constprop.0+0x6a0/0x6a0
> >>> __kasan_check_read+0x11/0x20
> >>> __zone_watermark_ok+0x114/0x270
> >>> get_page_from_freelist+0x148/0x630
> >>> is_module_text_address+0x32/0xa0
> >>> __alloc_pages_nodemask+0x2f6/0x790
> >>> __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
> >>> create_prof_cpu_mask+0x30/0x30
> >>> alloc_pages_current+0xb1/0x150
> >>> stack_depot_save+0x39f/0x490
> >>> kasan_save_stack+0x42/0x50
> >>> kasan_save_stack+0x23/0x50
> >>> kasan_record_aux_stack+0xa9/0xc0
> >>> __call_rcu+0xff/0x9c0
> >>> call_rcu+0xe/0x10
> >>> put_object+0x53/0x70
> >>> __delete_object+0x7b/0x90
> >>> kmemleak_free+0x46/0x70
> >>> slab_free_freelist_hook+0xb4/0x160
> >>> kfree+0xe5/0x420
> >>> kfree_const+0x17/0x30
> >>> kobject_cleanup+0xaa/0x230
> >>> kobject_put+0x76/0x90
> >>> netdev_queue_update_kobjects+0x17d/0x1f0
> >>> ... ...
> >>> ksys_write+0xd9/0x180
> >>> __x64_sys_write+0x42/0x50
> >>> do_syscall_64+0x38/0x50
> >>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >>>
> >>> Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
> >>> Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
> >>> Reported-by: Jianwei Hu <[email protected]>
> >>> Signed-off-by: Jun Miao <[email protected]>
> >>> ---
> >>> kernel/rcu/tree.c | 4 ++--
> >>> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> >>> index 8270e58cd0f3..2c1034580f15 100644
> >>> --- a/kernel/rcu/tree.c
> >>> +++ b/kernel/rcu/tree.c
> >>> @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> >>> head->func = func;
> >>> head->next = NULL;
> >>> local_irq_save(flags);
> >>> - kasan_record_aux_stack(head);
> >>> + kasan_record_aux_stack_noalloc(head);
> >>> rdp = this_cpu_ptr(&rcu_data);
> >>>
> >>> /* Add the callback to our list. */
> >>> @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> >>> return;
> >>> }
> >>>
> >>> - kasan_record_aux_stack(ptr);
> >>> + kasan_record_aux_stack_noalloc(ptr);
> >>> success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
> >>> if (!success) {
> >>> run_page_cache_worker(krcp);
> > Yep an allocation is tricky here. This change looks correct to me at
> > least from the point that it does not allocate.
> >
> > --
> > Uladzislau Rezki
>
> Thanks your approval. Could you like to give me a review?
>
Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>

--
Uladzislau Rezki

2021-11-03 18:18:57

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

On Wed, Nov 03, 2021 at 02:55:48PM +0100, Uladzislau Rezki wrote:
> On Wed, Nov 3, 2021 at 7:51 AM Jun Miao <[email protected]> wrote:
> >
> >
> > On 11/2/21 10:53 PM, Uladzislau Rezki wrote:
> > > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > >
> > >> Add KASAN maintainers
> > >>
> > >> On 11/1/21 6:31 PM, Jun Miao wrote:
> > >>> The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
> > >>> which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
> > >>> it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
> > >>> non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
> > >>>
> > >>> Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
> > >>> in case it runs out by using kasan_record_aux_stack_noalloc().
> > >>>
> > >>> Jianwei Hu reported:
> > >>> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
> > >>> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
> > >>> INFO: lockdep is turned off.
> > >>> irq event stamp: 0
> > >>> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> > >>> hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > >>> softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > >>> softirqs last disabled at (0): [<0000000000000000>] 0x0
> > >>> CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
> > >>> Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
> > >>> Call Trace:
> > >>> show_stack+0x52/0x58
> > >>> dump_stack+0xa1/0xd6
> > >>> ___might_sleep.cold+0x11c/0x12d
> > >>> rt_spin_lock+0x3f/0xc0
> > >>> rmqueue+0x100/0x1460
> > >>> rmqueue+0x100/0x1460
> > >>> mark_usage+0x1a0/0x1a0
> > >>> ftrace_graph_ret_addr+0x2a/0xb0
> > >>> rmqueue_pcplist.constprop.0+0x6a0/0x6a0
> > >>> __kasan_check_read+0x11/0x20
> > >>> __zone_watermark_ok+0x114/0x270
> > >>> get_page_from_freelist+0x148/0x630
> > >>> is_module_text_address+0x32/0xa0
> > >>> __alloc_pages_nodemask+0x2f6/0x790
> > >>> __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
> > >>> create_prof_cpu_mask+0x30/0x30
> > >>> alloc_pages_current+0xb1/0x150
> > >>> stack_depot_save+0x39f/0x490
> > >>> kasan_save_stack+0x42/0x50
> > >>> kasan_save_stack+0x23/0x50
> > >>> kasan_record_aux_stack+0xa9/0xc0
> > >>> __call_rcu+0xff/0x9c0
> > >>> call_rcu+0xe/0x10
> > >>> put_object+0x53/0x70
> > >>> __delete_object+0x7b/0x90
> > >>> kmemleak_free+0x46/0x70
> > >>> slab_free_freelist_hook+0xb4/0x160
> > >>> kfree+0xe5/0x420
> > >>> kfree_const+0x17/0x30
> > >>> kobject_cleanup+0xaa/0x230
> > >>> kobject_put+0x76/0x90
> > >>> netdev_queue_update_kobjects+0x17d/0x1f0
> > >>> ... ...
> > >>> ksys_write+0xd9/0x180
> > >>> __x64_sys_write+0x42/0x50
> > >>> do_syscall_64+0x38/0x50
> > >>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > >>>
> > >>> Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
> > >>> Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
> > >>> Reported-by: Jianwei Hu <[email protected]>
> > >>> Signed-off-by: Jun Miao <[email protected]>
> > >>> ---
> > >>> kernel/rcu/tree.c | 4 ++--
> > >>> 1 file changed, 2 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > >>> index 8270e58cd0f3..2c1034580f15 100644
> > >>> --- a/kernel/rcu/tree.c
> > >>> +++ b/kernel/rcu/tree.c
> > >>> @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> > >>> head->func = func;
> > >>> head->next = NULL;
> > >>> local_irq_save(flags);
> > >>> - kasan_record_aux_stack(head);
> > >>> + kasan_record_aux_stack_noalloc(head);
> > >>> rdp = this_cpu_ptr(&rcu_data);
> > >>>
> > >>> /* Add the callback to our list. */
> > >>> @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > >>> return;
> > >>> }
> > >>>
> > >>> - kasan_record_aux_stack(ptr);
> > >>> + kasan_record_aux_stack_noalloc(ptr);
> > >>> success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
> > >>> if (!success) {
> > >>> run_page_cache_worker(krcp);
> > > Yep an allocation is tricky here. This change looks correct to me at
> > > least from the point that it does not allocate.
> > >
> > > --
> > > Uladzislau Rezki
> >
> > Thanks your approval. Could you like to give me a review?
> >
> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>

I have queued it for review and testing, thank you both! I do have
some remaining concerns about this code being starved for memory. I am
wondering if the code needs to check the interrupt state. And perhaps
also whether locks are held. I of course will refrain from sending
this to mainline until these concerns are resolved.

Marco, Dmitry, thoughts?

Thanx, Paul

2021-11-03 21:23:11

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

On Wed, Nov 03, 2021 at 11:13:15AM -0700, Paul E. McKenney wrote:
> On Wed, Nov 03, 2021 at 02:55:48PM +0100, Uladzislau Rezki wrote:
> > On Wed, Nov 3, 2021 at 7:51 AM Jun Miao <[email protected]> wrote:
> > >
> > >
> > > On 11/2/21 10:53 PM, Uladzislau Rezki wrote:
> > > > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > > >
> > > >> Add KASAN maintainers
> > > >>
> > > >> On 11/1/21 6:31 PM, Jun Miao wrote:
> > > >>> The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
> > > >>> which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
> > > >>> it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
> > > >>> non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
> > > >>>
> > > >>> Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
> > > >>> in case it runs out by using kasan_record_aux_stack_noalloc().
> > > >>>
> > > >>> Jianwei Hu reported:
> > > >>> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
> > > >>> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
> > > >>> INFO: lockdep is turned off.
> > > >>> irq event stamp: 0
> > > >>> hardirqs last enabled at (0): [<0000000000000000>] 0x0
> > > >>> hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > > >>> softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > > >>> softirqs last disabled at (0): [<0000000000000000>] 0x0
> > > >>> CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
> > > >>> Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
> > > >>> Call Trace:
> > > >>> show_stack+0x52/0x58
> > > >>> dump_stack+0xa1/0xd6
> > > >>> ___might_sleep.cold+0x11c/0x12d
> > > >>> rt_spin_lock+0x3f/0xc0
> > > >>> rmqueue+0x100/0x1460
> > > >>> rmqueue+0x100/0x1460
> > > >>> mark_usage+0x1a0/0x1a0
> > > >>> ftrace_graph_ret_addr+0x2a/0xb0
> > > >>> rmqueue_pcplist.constprop.0+0x6a0/0x6a0
> > > >>> __kasan_check_read+0x11/0x20
> > > >>> __zone_watermark_ok+0x114/0x270
> > > >>> get_page_from_freelist+0x148/0x630
> > > >>> is_module_text_address+0x32/0xa0
> > > >>> __alloc_pages_nodemask+0x2f6/0x790
> > > >>> __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
> > > >>> create_prof_cpu_mask+0x30/0x30
> > > >>> alloc_pages_current+0xb1/0x150
> > > >>> stack_depot_save+0x39f/0x490
> > > >>> kasan_save_stack+0x42/0x50
> > > >>> kasan_save_stack+0x23/0x50
> > > >>> kasan_record_aux_stack+0xa9/0xc0
> > > >>> __call_rcu+0xff/0x9c0
> > > >>> call_rcu+0xe/0x10
> > > >>> put_object+0x53/0x70
> > > >>> __delete_object+0x7b/0x90
> > > >>> kmemleak_free+0x46/0x70
> > > >>> slab_free_freelist_hook+0xb4/0x160
> > > >>> kfree+0xe5/0x420
> > > >>> kfree_const+0x17/0x30
> > > >>> kobject_cleanup+0xaa/0x230
> > > >>> kobject_put+0x76/0x90
> > > >>> netdev_queue_update_kobjects+0x17d/0x1f0
> > > >>> ... ...
> > > >>> ksys_write+0xd9/0x180
> > > >>> __x64_sys_write+0x42/0x50
> > > >>> do_syscall_64+0x38/0x50
> > > >>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > >>>
> > > >>> Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
> > > >>> Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
> > > >>> Reported-by: Jianwei Hu <[email protected]>
> > > >>> Signed-off-by: Jun Miao <[email protected]>
> > > >>> ---
> > > >>> kernel/rcu/tree.c | 4 ++--
> > > >>> 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >>>
> > > >>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > >>> index 8270e58cd0f3..2c1034580f15 100644
> > > >>> --- a/kernel/rcu/tree.c
> > > >>> +++ b/kernel/rcu/tree.c
> > > >>> @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > >>> head->func = func;
> > > >>> head->next = NULL;
> > > >>> local_irq_save(flags);
> > > >>> - kasan_record_aux_stack(head);
> > > >>> + kasan_record_aux_stack_noalloc(head);
> > > >>> rdp = this_cpu_ptr(&rcu_data);
> > > >>>
> > > >>> /* Add the callback to our list. */
> > > >>> @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > >>> return;
> > > >>> }
> > > >>>
> > > >>> - kasan_record_aux_stack(ptr);
> > > >>> + kasan_record_aux_stack_noalloc(ptr);
> > > >>> success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
> > > >>> if (!success) {
> > > >>> run_page_cache_worker(krcp);
> > > > Yep an allocation is tricky here. This change looks correct to me at
> > > > least from the point that it does not allocate.
> > > >
> > > > --
> > > > Uladzislau Rezki
> > >
> > > Thanks your approval. Could you like to give me a review?
> > >
> > Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
>
> I have queued it for review and testing, thank you both! I do have
> some remaining concerns about this code being starved for memory. I am
> wondering if the code needs to check the interrupt state. And perhaps
> also whether locks are held. I of course will refrain from sending
> this to mainline until these concerns are resolved.
>
> Marco, Dmitry, thoughts?

Well, the compiler does have an opinion:

kernel/rcu/tree.c: In function ‘__call_rcu’:
kernel/rcu/tree.c:3029:2: error: implicit declaration of function ‘kasan_record_aux_stack_noalloc’; did you mean ‘kasan_record_aux_stack’? [-Werror=implicit-function-declaration]
3029 | kasan_record_aux_stack_noalloc(head);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| kasan_record_aux_stack

I get the same message after merging in current mainline.

I have therefore dropped this patch for the time being.

Thanx, Paul

2021-11-04 01:12:18

by Jun Miao

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack


On 11/4/21 5:21 AM, Paul E. McKenney wrote:
> [Please note: This e-mail is from an EXTERNAL e-mail address]
>
> On Wed, Nov 03, 2021 at 11:13:15AM -0700, Paul E. McKenney wrote:
>> On Wed, Nov 03, 2021 at 02:55:48PM +0100, Uladzislau Rezki wrote:
>>> On Wed, Nov 3, 2021 at 7:51 AM Jun Miao <[email protected]> wrote:
>>>>
>>>> On 11/2/21 10:53 PM, Uladzislau Rezki wrote:
>>>>> [Please note: This e-mail is from an EXTERNAL e-mail address]
>>>>>
>>>>>> Add KASAN maintainers
>>>>>>
>>>>>> On 11/1/21 6:31 PM, Jun Miao wrote:
>>>>>>> The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
>>>>>>> which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
>>>>>>> it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
>>>>>>> non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
>>>>>>>
>>>>>>> Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
>>>>>>> in case it runs out by using kasan_record_aux_stack_noalloc().
>>>>>>>
>>>>>>> Jianwei Hu reported:
>>>>>>> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
>>>>>>> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
>>>>>>> INFO: lockdep is turned off.
>>>>>>> irq event stamp: 0
>>>>>>> hardirqs last enabled at (0): [<0000000000000000>] 0x0
>>>>>>> hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
>>>>>>> softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
>>>>>>> softirqs last disabled at (0): [<0000000000000000>] 0x0
>>>>>>> CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
>>>>>>> Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
>>>>>>> Call Trace:
>>>>>>> show_stack+0x52/0x58
>>>>>>> dump_stack+0xa1/0xd6
>>>>>>> ___might_sleep.cold+0x11c/0x12d
>>>>>>> rt_spin_lock+0x3f/0xc0
>>>>>>> rmqueue+0x100/0x1460
>>>>>>> rmqueue+0x100/0x1460
>>>>>>> mark_usage+0x1a0/0x1a0
>>>>>>> ftrace_graph_ret_addr+0x2a/0xb0
>>>>>>> rmqueue_pcplist.constprop.0+0x6a0/0x6a0
>>>>>>> __kasan_check_read+0x11/0x20
>>>>>>> __zone_watermark_ok+0x114/0x270
>>>>>>> get_page_from_freelist+0x148/0x630
>>>>>>> is_module_text_address+0x32/0xa0
>>>>>>> __alloc_pages_nodemask+0x2f6/0x790
>>>>>>> __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
>>>>>>> create_prof_cpu_mask+0x30/0x30
>>>>>>> alloc_pages_current+0xb1/0x150
>>>>>>> stack_depot_save+0x39f/0x490
>>>>>>> kasan_save_stack+0x42/0x50
>>>>>>> kasan_save_stack+0x23/0x50
>>>>>>> kasan_record_aux_stack+0xa9/0xc0
>>>>>>> __call_rcu+0xff/0x9c0
>>>>>>> call_rcu+0xe/0x10
>>>>>>> put_object+0x53/0x70
>>>>>>> __delete_object+0x7b/0x90
>>>>>>> kmemleak_free+0x46/0x70
>>>>>>> slab_free_freelist_hook+0xb4/0x160
>>>>>>> kfree+0xe5/0x420
>>>>>>> kfree_const+0x17/0x30
>>>>>>> kobject_cleanup+0xaa/0x230
>>>>>>> kobject_put+0x76/0x90
>>>>>>> netdev_queue_update_kobjects+0x17d/0x1f0
>>>>>>> ... ...
>>>>>>> ksys_write+0xd9/0x180
>>>>>>> __x64_sys_write+0x42/0x50
>>>>>>> do_syscall_64+0x38/0x50
>>>>>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>>>>>
>>>>>>> Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
>>>>>>> Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
>>>>>>> Reported-by: Jianwei Hu <[email protected]>
>>>>>>> Signed-off-by: Jun Miao <[email protected]>
>>>>>>> ---
>>>>>>> kernel/rcu/tree.c | 4 ++--
>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>>>>>>> index 8270e58cd0f3..2c1034580f15 100644
>>>>>>> --- a/kernel/rcu/tree.c
>>>>>>> +++ b/kernel/rcu/tree.c
>>>>>>> @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
>>>>>>> head->func = func;
>>>>>>> head->next = NULL;
>>>>>>> local_irq_save(flags);
>>>>>>> - kasan_record_aux_stack(head);
>>>>>>> + kasan_record_aux_stack_noalloc(head);
>>>>>>> rdp = this_cpu_ptr(&rcu_data);
>>>>>>>
>>>>>>> /* Add the callback to our list. */
>>>>>>> @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
>>>>>>> return;
>>>>>>> }
>>>>>>>
>>>>>>> - kasan_record_aux_stack(ptr);
>>>>>>> + kasan_record_aux_stack_noalloc(ptr);
>>>>>>> success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
>>>>>>> if (!success) {
>>>>>>> run_page_cache_worker(krcp);
>>>>> Yep an allocation is tricky here. This change looks correct to me at
>>>>> least from the point that it does not allocate.
>>>>>
>>>>> --
>>>>> Uladzislau Rezki
>>>> Thanks your approval. Could you like to give me a review?
>>>>
>>> Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
>> I have queued it for review and testing, thank you both! I do have
>> some remaining concerns about this code being starved for memory. I am
>> wondering if the code needs to check the interrupt state. And perhaps
>> also whether locks are held. I of course will refrain from sending
>> this to mainline until these concerns are resolved.
>>
>> Marco, Dmitry, thoughts?
> Well, the compiler does have an opinion:
>
> kernel/rcu/tree.c: In function ‘__call_rcu’:
> kernel/rcu/tree.c:3029:2: error: implicit declaration of function ‘kasan_record_aux_stack_noalloc’; did you mean ‘kasan_record_aux_stack’? [-Werror=implicit-function-declaration]
> 3029 | kasan_record_aux_stack_noalloc(head);
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> | kasan_record_aux_stack
>
> I get the same message after merging in current mainline.
>
> I have therefore dropped this patch for the time being.
>
> Thanx, Paul
Hi Paul E,
The kasan_record_aux_stack_noalloc() is just introduce to linux-next
now, and marking "Notice: this object is not reachable from any branch."
in commit.
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/include/linux/kasan.h?h=next-20211029&id=2f64acf6b653d01fbdc92a693f12bbf71a205926

--- ---
Jun Miao

2021-11-04 01:35:44

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

On Thu, Nov 04, 2021 at 09:09:24AM +0800, Jun Miao wrote:
>
> On 11/4/21 5:21 AM, Paul E. McKenney wrote:
> > [Please note: This e-mail is from an EXTERNAL e-mail address]
> >
> > On Wed, Nov 03, 2021 at 11:13:15AM -0700, Paul E. McKenney wrote:
> > > On Wed, Nov 03, 2021 at 02:55:48PM +0100, Uladzislau Rezki wrote:
> > > > On Wed, Nov 3, 2021 at 7:51 AM Jun Miao <[email protected]> wrote:
> > > > >
> > > > > On 11/2/21 10:53 PM, Uladzislau Rezki wrote:
> > > > > > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > > > > >
> > > > > > > Add KASAN maintainers
> > > > > > >
> > > > > > > On 11/1/21 6:31 PM, Jun Miao wrote:
> > > > > > > > The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
> > > > > > > > which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
> > > > > > > > it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
> > > > > > > > non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
> > > > > > > >
> > > > > > > > Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
> > > > > > > > in case it runs out by using kasan_record_aux_stack_noalloc().
> > > > > > > >
> > > > > > > > Jianwei Hu reported:
> > > > > > > > BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
> > > > > > > > in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
> > > > > > > > INFO: lockdep is turned off.
> > > > > > > > irq event stamp: 0
> > > > > > > > hardirqs last enabled at (0): [<0000000000000000>] 0x0
> > > > > > > > hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > > > > > > > softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > > > > > > > softirqs last disabled at (0): [<0000000000000000>] 0x0
> > > > > > > > CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
> > > > > > > > Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
> > > > > > > > Call Trace:
> > > > > > > > show_stack+0x52/0x58
> > > > > > > > dump_stack+0xa1/0xd6
> > > > > > > > ___might_sleep.cold+0x11c/0x12d
> > > > > > > > rt_spin_lock+0x3f/0xc0
> > > > > > > > rmqueue+0x100/0x1460
> > > > > > > > rmqueue+0x100/0x1460
> > > > > > > > mark_usage+0x1a0/0x1a0
> > > > > > > > ftrace_graph_ret_addr+0x2a/0xb0
> > > > > > > > rmqueue_pcplist.constprop.0+0x6a0/0x6a0
> > > > > > > > __kasan_check_read+0x11/0x20
> > > > > > > > __zone_watermark_ok+0x114/0x270
> > > > > > > > get_page_from_freelist+0x148/0x630
> > > > > > > > is_module_text_address+0x32/0xa0
> > > > > > > > __alloc_pages_nodemask+0x2f6/0x790
> > > > > > > > __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
> > > > > > > > create_prof_cpu_mask+0x30/0x30
> > > > > > > > alloc_pages_current+0xb1/0x150
> > > > > > > > stack_depot_save+0x39f/0x490
> > > > > > > > kasan_save_stack+0x42/0x50
> > > > > > > > kasan_save_stack+0x23/0x50
> > > > > > > > kasan_record_aux_stack+0xa9/0xc0
> > > > > > > > __call_rcu+0xff/0x9c0
> > > > > > > > call_rcu+0xe/0x10
> > > > > > > > put_object+0x53/0x70
> > > > > > > > __delete_object+0x7b/0x90
> > > > > > > > kmemleak_free+0x46/0x70
> > > > > > > > slab_free_freelist_hook+0xb4/0x160
> > > > > > > > kfree+0xe5/0x420
> > > > > > > > kfree_const+0x17/0x30
> > > > > > > > kobject_cleanup+0xaa/0x230
> > > > > > > > kobject_put+0x76/0x90
> > > > > > > > netdev_queue_update_kobjects+0x17d/0x1f0
> > > > > > > > ... ...
> > > > > > > > ksys_write+0xd9/0x180
> > > > > > > > __x64_sys_write+0x42/0x50
> > > > > > > > do_syscall_64+0x38/0x50
> > > > > > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > > > > >
> > > > > > > > Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
> > > > > > > > Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
> > > > > > > > Reported-by: Jianwei Hu <[email protected]>
> > > > > > > > Signed-off-by: Jun Miao <[email protected]>
> > > > > > > > ---
> > > > > > > > kernel/rcu/tree.c | 4 ++--
> > > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > > > index 8270e58cd0f3..2c1034580f15 100644
> > > > > > > > --- a/kernel/rcu/tree.c
> > > > > > > > +++ b/kernel/rcu/tree.c
> > > > > > > > @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > > > > > > head->func = func;
> > > > > > > > head->next = NULL;
> > > > > > > > local_irq_save(flags);
> > > > > > > > - kasan_record_aux_stack(head);
> > > > > > > > + kasan_record_aux_stack_noalloc(head);
> > > > > > > > rdp = this_cpu_ptr(&rcu_data);
> > > > > > > >
> > > > > > > > /* Add the callback to our list. */
> > > > > > > > @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > > > > > > return;
> > > > > > > > }
> > > > > > > >
> > > > > > > > - kasan_record_aux_stack(ptr);
> > > > > > > > + kasan_record_aux_stack_noalloc(ptr);
> > > > > > > > success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
> > > > > > > > if (!success) {
> > > > > > > > run_page_cache_worker(krcp);
> > > > > > Yep an allocation is tricky here. This change looks correct to me at
> > > > > > least from the point that it does not allocate.
> > > > > >
> > > > > > --
> > > > > > Uladzislau Rezki
> > > > > Thanks your approval. Could you like to give me a review?
> > > > >
> > > > Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
> > > I have queued it for review and testing, thank you both! I do have
> > > some remaining concerns about this code being starved for memory. I am
> > > wondering if the code needs to check the interrupt state. And perhaps
> > > also whether locks are held. I of course will refrain from sending
> > > this to mainline until these concerns are resolved.
> > >
> > > Marco, Dmitry, thoughts?
> > Well, the compiler does have an opinion:
> >
> > kernel/rcu/tree.c: In function ‘__call_rcu’:
> > kernel/rcu/tree.c:3029:2: error: implicit declaration of function ‘kasan_record_aux_stack_noalloc’; did you mean ‘kasan_record_aux_stack’? [-Werror=implicit-function-declaration]
> > 3029 | kasan_record_aux_stack_noalloc(head);
> > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > | kasan_record_aux_stack
> >
> > I get the same message after merging in current mainline.
> >
> > I have therefore dropped this patch for the time being.
> >
> > Thanx, Paul
> Hi Paul E,
> The kasan_record_aux_stack_noalloc() is just introduce to linux-next now,
> and marking "Notice: this object is not reachable from any branch." in
> commit.
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/include/linux/kasan.h?h=next-20211029&id=2f64acf6b653d01fbdc92a693f12bbf71a205926

That would explain it! Feel free to resend once the functionality is
more generally available.

Thanx, Paul

2021-11-08 14:23:15

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

On Thu, 4 Nov 2021 at 02:28, Paul E. McKenney <[email protected]> wrote:
>
> On Thu, Nov 04, 2021 at 09:09:24AM +0800, Jun Miao wrote:
> >
> > On 11/4/21 5:21 AM, Paul E. McKenney wrote:
> > > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > >
> > > On Wed, Nov 03, 2021 at 11:13:15AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Nov 03, 2021 at 02:55:48PM +0100, Uladzislau Rezki wrote:
> > > > > On Wed, Nov 3, 2021 at 7:51 AM Jun Miao <[email protected]> wrote:
> > > > > >
> > > > > > On 11/2/21 10:53 PM, Uladzislau Rezki wrote:
> > > > > > > [Please note: This e-mail is from an EXTERNAL e-mail address]
> > > > > > >
> > > > > > > > Add KASAN maintainers
> > > > > > > >
> > > > > > > > On 11/1/21 6:31 PM, Jun Miao wrote:
> > > > > > > > > The default kasan_record_aux_stack() calls stack_depot_save() with GFP_NOWAIT,
> > > > > > > > > which in turn can then call alloc_pages(GFP_NOWAIT, ...). In general, however,
> > > > > > > > > it is not even possible to use either GFP_ATOMIC nor GFP_NOWAIT in certain
> > > > > > > > > non-preemptive contexts/RT kernel including raw_spin_locks (see gfp.h and ab00db216c9c7).
> > > > > > > > >
> > > > > > > > > Fix it by instructing stackdepot to not expand stack storage via alloc_pages()
> > > > > > > > > in case it runs out by using kasan_record_aux_stack_noalloc().
> > > > > > > > >
> > > > > > > > > Jianwei Hu reported:
> > > > > > > > > BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:969
> > > > > > > > > in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 15319, name: python3
> > > > > > > > > INFO: lockdep is turned off.
> > > > > > > > > irq event stamp: 0
> > > > > > > > > hardirqs last enabled at (0): [<0000000000000000>] 0x0
> > > > > > > > > hardirqs last disabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > > > > > > > > softirqs last enabled at (0): [<ffffffff856c8b13>] copy_process+0xaf3/0x2590
> > > > > > > > > softirqs last disabled at (0): [<0000000000000000>] 0x0
> > > > > > > > > CPU: 6 PID: 15319 Comm: python3 Tainted: G W O 5.15-rc7-preempt-rt #1
> > > > > > > > > Hardware name: Supermicro SYS-E300-9A-8C/A2SDi-8C-HLN4F, BIOS 1.1b 12/17/2018
> > > > > > > > > Call Trace:
> > > > > > > > > show_stack+0x52/0x58
> > > > > > > > > dump_stack+0xa1/0xd6
> > > > > > > > > ___might_sleep.cold+0x11c/0x12d
> > > > > > > > > rt_spin_lock+0x3f/0xc0
> > > > > > > > > rmqueue+0x100/0x1460
> > > > > > > > > rmqueue+0x100/0x1460
> > > > > > > > > mark_usage+0x1a0/0x1a0
> > > > > > > > > ftrace_graph_ret_addr+0x2a/0xb0
> > > > > > > > > rmqueue_pcplist.constprop.0+0x6a0/0x6a0
> > > > > > > > > __kasan_check_read+0x11/0x20
> > > > > > > > > __zone_watermark_ok+0x114/0x270
> > > > > > > > > get_page_from_freelist+0x148/0x630
> > > > > > > > > is_module_text_address+0x32/0xa0
> > > > > > > > > __alloc_pages_nodemask+0x2f6/0x790
> > > > > > > > > __alloc_pages_slowpath.constprop.0+0x12d0/0x12d0
> > > > > > > > > create_prof_cpu_mask+0x30/0x30
> > > > > > > > > alloc_pages_current+0xb1/0x150
> > > > > > > > > stack_depot_save+0x39f/0x490
> > > > > > > > > kasan_save_stack+0x42/0x50
> > > > > > > > > kasan_save_stack+0x23/0x50
> > > > > > > > > kasan_record_aux_stack+0xa9/0xc0
> > > > > > > > > __call_rcu+0xff/0x9c0
> > > > > > > > > call_rcu+0xe/0x10
> > > > > > > > > put_object+0x53/0x70
> > > > > > > > > __delete_object+0x7b/0x90
> > > > > > > > > kmemleak_free+0x46/0x70
> > > > > > > > > slab_free_freelist_hook+0xb4/0x160
> > > > > > > > > kfree+0xe5/0x420
> > > > > > > > > kfree_const+0x17/0x30
> > > > > > > > > kobject_cleanup+0xaa/0x230
> > > > > > > > > kobject_put+0x76/0x90
> > > > > > > > > netdev_queue_update_kobjects+0x17d/0x1f0
> > > > > > > > > ... ...
> > > > > > > > > ksys_write+0xd9/0x180
> > > > > > > > > __x64_sys_write+0x42/0x50
> > > > > > > > > do_syscall_64+0x38/0x50
> > > > > > > > > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > > > > > > > >
> > > > > > > > > Fixes: 84109ab58590 ("rcu: Record kvfree_call_rcu() call stack for KASAN")
> > > > > > > > > Fixes: 26e760c9a7c8 ("rcu: kasan: record and print call_rcu() call stack")
> > > > > > > > > Reported-by: Jianwei Hu <[email protected]>
> > > > > > > > > Signed-off-by: Jun Miao <[email protected]>
> > > > > > > > > ---
> > > > > > > > > kernel/rcu/tree.c | 4 ++--
> > > > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > > > > index 8270e58cd0f3..2c1034580f15 100644
> > > > > > > > > --- a/kernel/rcu/tree.c
> > > > > > > > > +++ b/kernel/rcu/tree.c
> > > > > > > > > @@ -3026,7 +3026,7 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > > > > > > > head->func = func;
> > > > > > > > > head->next = NULL;
> > > > > > > > > local_irq_save(flags);
> > > > > > > > > - kasan_record_aux_stack(head);
> > > > > > > > > + kasan_record_aux_stack_noalloc(head);
> > > > > > > > > rdp = this_cpu_ptr(&rcu_data);
> > > > > > > > >
> > > > > > > > > /* Add the callback to our list. */
> > > > > > > > > @@ -3591,7 +3591,7 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > > > > > > > > return;
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > - kasan_record_aux_stack(ptr);
> > > > > > > > > + kasan_record_aux_stack_noalloc(ptr);
> > > > > > > > > success = add_ptr_to_bulk_krc_lock(&krcp, &flags, ptr, !head);
> > > > > > > > > if (!success) {
> > > > > > > > > run_page_cache_worker(krcp);
> > > > > > > Yep an allocation is tricky here. This change looks correct to me at
> > > > > > > least from the point that it does not allocate.
> > > > > > >
> > > > > > > --
> > > > > > > Uladzislau Rezki
> > > > > > Thanks your approval. Could you like to give me a review?
> > > > > >
> > > > > Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
> > > > I have queued it for review and testing, thank you both! I do have
> > > > some remaining concerns about this code being starved for memory. I am
> > > > wondering if the code needs to check the interrupt state. And perhaps
> > > > also whether locks are held. I of course will refrain from sending
> > > > this to mainline until these concerns are resolved.
> > > >
> > > > Marco, Dmitry, thoughts?
> > > Well, the compiler does have an opinion:
> > >
> > > kernel/rcu/tree.c: In function ‘__call_rcu’:
> > > kernel/rcu/tree.c:3029:2: error: implicit declaration of function ‘kasan_record_aux_stack_noalloc’; did you mean ‘kasan_record_aux_stack’? [-Werror=implicit-function-declaration]
> > > 3029 | kasan_record_aux_stack_noalloc(head);
> > > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > | kasan_record_aux_stack
> > >
> > > I get the same message after merging in current mainline.
> > >
> > > I have therefore dropped this patch for the time being.
> > >
> > > Thanx, Paul
> > Hi Paul E,
> > The kasan_record_aux_stack_noalloc() is just introduce to linux-next now,
> > and marking "Notice: this object is not reachable from any branch." in
> > commit.
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/include/linux/kasan.h?h=next-20211029&id=2f64acf6b653d01fbdc92a693f12bbf71a205926
>
> That would explain it! Feel free to resend once the functionality is
> more generally available.

[email protected] mailing list

I found the full commit with kasan_record_aux_stack_noalloc() implementation:
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20211029&id=2f64acf6b653d01fbdc92a693f12bbf71a205926

but it calls kasan_save_stack() with second bool argument, and
kasan_save_stack() accepts only 1 argument:
https://elixir.bootlin.com/linux/latest/source/mm/kasan/common.c#L33
so I am lost and can't comment on any of the Paul's questions re
interrupts/spinlocks.

When re-sending this, please add kasan-dev@ mailing list and add links
to the dependencies.

Thanks

2021-11-08 18:12:48

by Marco Elver

[permalink] [raw]
Subject: Re: [PATCH] rcu: avoid alloc_pages() when recording stack

On Mon, 8 Nov 2021 at 12:42, 'Dmitry Vyukov' via kasan-dev
<[email protected]> wrote:
[...]
> > > > > > Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
> > > > > I have queued it for review and testing, thank you both! I do have
> > > > > some remaining concerns about this code being starved for memory. I am
> > > > > wondering if the code needs to check the interrupt state. And perhaps
> > > > > also whether locks are held. I of course will refrain from sending
> > > > > this to mainline until these concerns are resolved.
> > > > >
> > > > > Marco, Dmitry, thoughts?

It's a general limitation of kasan_record_aux_stack_noalloc(), and if
stackdepot's pool is exhausted, we just don't record the stacktrace.
But given we just can't allocate any memory at all, I think it's the
best we can do.

However, given there are enough normal (with allocation allowed) uses
of stackdepot with KASAN enabled, the chances of stackdepot having
exhausted its pool when calling this are small. The condition when
recording the stack with the _noalloc() variant would fail is:
stackdepot runs out of space AND the same stack trace has not been
recorded before. And the only time we'd notice this is if we actually
hit a kernel bug that KASAN wants to report. The aggregate probability
of all this happening is very very low.

The original series has some more explanation:
https://lkml.kernel.org/r/[email protected]

> > > > Well, the compiler does have an opinion:
> > > >
> > > > kernel/rcu/tree.c: In function ‘__call_rcu’:
> > > > kernel/rcu/tree.c:3029:2: error: implicit declaration of function ‘kasan_record_aux_stack_noalloc’; did you mean ‘kasan_record_aux_stack’? [-Werror=implicit-function-declaration]
> > > > 3029 | kasan_record_aux_stack_noalloc(head);
> > > > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > > | kasan_record_aux_stack
> > > >
> > > > I get the same message after merging in current mainline.
> > > >
> > > > I have therefore dropped this patch for the time being.
> > > >
> > > > Thanx, Paul
> > > Hi Paul E,
> > > The kasan_record_aux_stack_noalloc() is just introduce to linux-next now,
> > > and marking "Notice: this object is not reachable from any branch." in
> > > commit.
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/include/linux/kasan.h?h=next-20211029&id=2f64acf6b653d01fbdc92a693f12bbf71a205926

"Notice: this object is not reachable from any branch." because it
kept changing the hash since it's in the -mm tree.

> > That would explain it! Feel free to resend once the functionality is
> > more generally available.
>
> [email protected] mailing list
>
> I found the full commit with kasan_record_aux_stack_noalloc() implementation:
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20211029&id=2f64acf6b653d01fbdc92a693f12bbf71a205926
>
> but it calls kasan_save_stack() with second bool argument, and
> kasan_save_stack() accepts only 1 argument:
> https://elixir.bootlin.com/linux/latest/source/mm/kasan/common.c#L33
> so I am lost and can't comment on any of the Paul's questions re
> interrupts/spinlocks.

None of the kasan_record_aux_stack_noalloc() code and its dependencies
were in mainline yet. I think it landed a few days ago, but too late
for the patches here.