2016-12-12 05:36:07

by kernel test robot

[permalink] [raw]
Subject: [lkp-developer] [kernel/fork] cc639db4ac: BUG:using_smp_processor_id()in_preemptible

FYI, we noticed the following commit:

commit: cc639db4acfeb459f3dcec080c6cfe11e36266e0 ("kernel/fork: use vfree_atomic() to free thread stack")
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

in testcase: iperf
with following parameters:

runtime: 300s
cluster: cs-localhost
protocol: tcp

test-description: iperf is a tool for active measurements of the maximum achievable bandwidth on IP networks.
test-url: https://iperf.fr/


on test machine: qemu-system-x86_64 -enable-kvm -cpu Nehalem -smp 2 -m 1G

caused below changes:


+-------------------------------------------------+------------+------------+
| | 1d2bc599c4 | cc639db4ac |
+-------------------------------------------------+------------+------------+
| boot_successes | 48 | 41 |
| boot_failures | 24 | 29 |
| invoked_oom-killer:gfp_mask=0x | 21 | 19 |
| Mem-Info | 21 | 19 |
| BUG:kernel_reboot-without-warning_in_test_stage | 3 | 2 |
| BUG:using_smp_processor_id()in_preemptible | 0 | 13 |
| calltrace:SyS_clone | 0 | 8 |
+-------------------------------------------------+------------+------------+



[ 36.609372]
[ 37.466792] skip http request: cgi-bin/lkp-jobfile-append-var?job_file=/lkp/scheduled/vm-vp-1G-7/iperf-300s-cs-localhost-tcp-debian-x86_64-2016-08-31.cgz-cc639db4acfeb459f3dcec080c6cfe11e36266e0-20161209-43576-1ch14l2-25.yaml&job_state=running -o /dev/null
[ 37.470889]
[ 38.354206] BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277
[ 38.364617] caller is debug_smp_processor_id+0x17/0x19
[ 38.369836] CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2
[ 38.371241] ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da
[ 38.372656] ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff
[ 38.375556] ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a
[ 38.381015] Call Trace:
[ 38.381439] [<ffffffff8123ae6f>] dump_stack+0x9a/0xd0
[ 38.382302] [<ffffffff81252f41>] check_preemption_disabled+0xdd/0xef
[ 38.383365] [<ffffffff81252f6a>] debug_smp_processor_id+0x17/0x19
[ 38.384389] [<ffffffff811796df>] __vfree_deferred+0x16/0x4c
[ 38.389428] [<ffffffff8117b584>] vfree_atomic+0x22/0x24
[ 38.390334] [<ffffffff81094f5d>] free_thread_stack+0xc2/0x106
[ 38.391330] [<ffffffff810951be>] put_task_stack+0x4c/0x62
[ 38.392239] [<ffffffff81095f81>] copy_process+0x7e0/0x16e8
[ 38.397439] [<ffffffff8109702d>] _do_fork+0xbb/0x2d3
[ 38.398285] [<ffffffff810465e8>] ? __do_page_fault+0x2e1/0x384
[ 38.399275] [<ffffffff8112633f>] ? trace_hardirqs_off_caller+0x12/0x24
[ 38.400382] [<ffffffff810972cb>] SyS_clone+0x19/0x1b
[ 38.405409] [<ffffffff81003800>] do_syscall_64+0x143/0x173
[ 38.406340] [<ffffffff81507289>] entry_SYSCALL64_slow_path+0x25/0x25
[ 39.757101] {
[ 39.757377]


To reproduce:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Ying Huang


Attachments:
(No filename) (3.33 kB)
config-4.9.0-rc8-00140-gcc639db (93.11 kB)
job-script (6.23 kB)
dmesg.xz (10.67 kB)
Download all attachments

2016-12-12 15:02:00

by Andrey Ryabinin

[permalink] [raw]
Subject: [PATCH] mm-add-vfree_atomic-fix

DEBUG_PREEMPT complains about using this_cpu_ptr() in preemptible:
BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277
caller is debug_smp_processor_id+0x17/0x19
CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2
ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da
ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff
ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a
Call Trace:
[<ffffffff8123ae6f>] dump_stack+0x9a/0xd0
[<ffffffff81252f41>] check_preemption_disabled+0xdd/0xef
[<ffffffff81252f6a>] debug_smp_processor_id+0x17/0x19
[<ffffffff811796df>] __vfree_deferred+0x16/0x4c
[<ffffffff8117b584>] vfree_atomic+0x22/0x24
[<ffffffff81094f5d>] free_thread_stack+0xc2/0x106
[<ffffffff810951be>] put_task_stack+0x4c/0x62
[<ffffffff81095f81>] copy_process+0x7e0/0x16e8
[<ffffffff8109702d>] _do_fork+0xbb/0x2d3
[<ffffffff810465e8>] ? __do_page_fault+0x2e1/0x384
[<ffffffff8112633f>] ? trace_hardirqs_off_caller+0x12/0x24
[<ffffffff810972cb>] SyS_clone+0x19/0x1b
[<ffffffff81003800>] do_syscall_64+0x143/0x173
[<ffffffff81507289>] entry_SYSCALL64_slow_path+0x25/0x25

Use raw_cpu_ptr() instead of this_cpu_ptr() to hide this warning.
It's fine because llist_add() implementation is lock-less, so it works even
if we adding to the list of some other cpu. schedule_work() is also preempt-safe.

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
---
mm/vmalloc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 43f0608..d8813963 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1498,7 +1498,14 @@ static void __vunmap(const void *addr, int deallocate_pages)

static inline void __vfree_deferred(const void *addr)
{
- struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
+ /*
+ * Use raw_cpu_ptr() because this can be called from preemptible
+ * context. Preemption is absolutely fine here, because llist_add()
+ * implementation is lockless, so it works even if we adding to list
+ * of the other cpu.
+ * schedule_work() should be fine with this too.
+ */
+ struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);

if (llist_add((struct llist_node *)addr, &p->list))
schedule_work(&p->wq);
--
2.7.3

2016-12-13 10:12:58

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm-add-vfree_atomic-fix

[CC Andy]

I've noticed the same
http://lkml.kernel.org/r/[email protected]
and also concluded same as you

On Mon 12-12-16 17:46:21, Andrey Ryabinin wrote:
> DEBUG_PREEMPT complains about using this_cpu_ptr() in preemptible:
> BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277
> caller is debug_smp_processor_id+0x17/0x19
> CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2
> ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da
> ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff
> ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a
> Call Trace:
> [<ffffffff8123ae6f>] dump_stack+0x9a/0xd0
> [<ffffffff81252f41>] check_preemption_disabled+0xdd/0xef
> [<ffffffff81252f6a>] debug_smp_processor_id+0x17/0x19
> [<ffffffff811796df>] __vfree_deferred+0x16/0x4c
> [<ffffffff8117b584>] vfree_atomic+0x22/0x24
> [<ffffffff81094f5d>] free_thread_stack+0xc2/0x106
> [<ffffffff810951be>] put_task_stack+0x4c/0x62
> [<ffffffff81095f81>] copy_process+0x7e0/0x16e8
> [<ffffffff8109702d>] _do_fork+0xbb/0x2d3
> [<ffffffff810465e8>] ? __do_page_fault+0x2e1/0x384
> [<ffffffff8112633f>] ? trace_hardirqs_off_caller+0x12/0x24
> [<ffffffff810972cb>] SyS_clone+0x19/0x1b
> [<ffffffff81003800>] do_syscall_64+0x143/0x173
> [<ffffffff81507289>] entry_SYSCALL64_slow_path+0x25/0x25
>
> Use raw_cpu_ptr() instead of this_cpu_ptr() to hide this warning.
> It's fine because llist_add() implementation is lock-less, so it works even
> if we adding to the list of some other cpu. schedule_work() is also preempt-safe.
>
> Reported-by: kernel test robot <[email protected]>
> Signed-off-by: Andrey Ryabinin <[email protected]>

Acked-by: Michal Hocko <[email protected]>

> ---
> mm/vmalloc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 43f0608..d8813963 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1498,7 +1498,14 @@ static void __vunmap(const void *addr, int deallocate_pages)
>
> static inline void __vfree_deferred(const void *addr)
> {
> - struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
> + /*
> + * Use raw_cpu_ptr() because this can be called from preemptible
> + * context. Preemption is absolutely fine here, because llist_add()
> + * implementation is lockless, so it works even if we adding to list
> + * of the other cpu.
> + * schedule_work() should be fine with this too.
> + */
> + struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);
>
> if (llist_add((struct llist_node *)addr, &p->list))
> schedule_work(&p->wq);
> --
> 2.7.3

--
Michal Hocko
SUSE Labs

2016-12-13 16:58:35

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] mm-add-vfree_atomic-fix

On Tue, Dec 13, 2016 at 2:12 AM, Michal Hocko <[email protected]> wrote:
> [CC Andy]
>
> I've noticed the same
> http://lkml.kernel.org/r/[email protected]
> and also concluded same as you
>
> On Mon 12-12-16 17:46:21, Andrey Ryabinin wrote:
>> DEBUG_PREEMPT complains about using this_cpu_ptr() in preemptible:
>> BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277
>> caller is debug_smp_processor_id+0x17/0x19
>> CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2
>> ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da
>> ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff
>> ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a
>> Call Trace:
>> [<ffffffff8123ae6f>] dump_stack+0x9a/0xd0
>> [<ffffffff81252f41>] check_preemption_disabled+0xdd/0xef
>> [<ffffffff81252f6a>] debug_smp_processor_id+0x17/0x19
>> [<ffffffff811796df>] __vfree_deferred+0x16/0x4c
>> [<ffffffff8117b584>] vfree_atomic+0x22/0x24
>> [<ffffffff81094f5d>] free_thread_stack+0xc2/0x106
>> [<ffffffff810951be>] put_task_stack+0x4c/0x62
>> [<ffffffff81095f81>] copy_process+0x7e0/0x16e8
>> [<ffffffff8109702d>] _do_fork+0xbb/0x2d3
>> [<ffffffff810465e8>] ? __do_page_fault+0x2e1/0x384
>> [<ffffffff8112633f>] ? trace_hardirqs_off_caller+0x12/0x24
>> [<ffffffff810972cb>] SyS_clone+0x19/0x1b
>> [<ffffffff81003800>] do_syscall_64+0x143/0x173
>> [<ffffffff81507289>] entry_SYSCALL64_slow_path+0x25/0x25
>>
>> Use raw_cpu_ptr() instead of this_cpu_ptr() to hide this warning.
>> It's fine because llist_add() implementation is lock-less, so it works even
>> if we adding to the list of some other cpu. schedule_work() is also preempt-safe.
>>
>> Reported-by: kernel test robot <[email protected]>
>> Signed-off-by: Andrey Ryabinin <[email protected]>
>
> Acked-by: Michal Hocko <[email protected]>

But not quite acked by me. What happened to the vfree code that
causes vfree_deferred to be called in a preemptable context? That
sounds like a bug.

(This code doesn't exist in Linus' tree. What tree does this apply to.)

>
>> ---
>> mm/vmalloc.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index 43f0608..d8813963 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -1498,7 +1498,14 @@ static void __vunmap(const void *addr, int deallocate_pages)
>>
>> static inline void __vfree_deferred(const void *addr)
>> {
>> - struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
>> + /*
>> + * Use raw_cpu_ptr() because this can be called from preemptible
>> + * context. Preemption is absolutely fine here, because llist_add()
>> + * implementation is lockless, so it works even if we adding to list
>> + * of the other cpu.
>> + * schedule_work() should be fine with this too.
>> + */
>> + struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);
>>
>> if (llist_add((struct llist_node *)addr, &p->list))
>> schedule_work(&p->wq);
>> --
>> 2.7.3
>
> --
> Michal Hocko
> SUSE Labs



--
Andy Lutomirski
AMA Capital Management, LLC

2016-12-13 17:39:58

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm-add-vfree_atomic-fix

On Tue 13-12-16 08:57:34, Andy Lutomirski wrote:
> On Tue, Dec 13, 2016 at 2:12 AM, Michal Hocko <[email protected]> wrote:
> > [CC Andy]
> >
> > I've noticed the same
> > http://lkml.kernel.org/r/[email protected]
> > and also concluded same as you
> >
> > On Mon 12-12-16 17:46:21, Andrey Ryabinin wrote:
> >> DEBUG_PREEMPT complains about using this_cpu_ptr() in preemptible:
> >> BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277
> >> caller is debug_smp_processor_id+0x17/0x19
> >> CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2
> >> ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da
> >> ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff
> >> ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a
> >> Call Trace:
> >> [<ffffffff8123ae6f>] dump_stack+0x9a/0xd0
> >> [<ffffffff81252f41>] check_preemption_disabled+0xdd/0xef
> >> [<ffffffff81252f6a>] debug_smp_processor_id+0x17/0x19
> >> [<ffffffff811796df>] __vfree_deferred+0x16/0x4c
> >> [<ffffffff8117b584>] vfree_atomic+0x22/0x24
> >> [<ffffffff81094f5d>] free_thread_stack+0xc2/0x106
> >> [<ffffffff810951be>] put_task_stack+0x4c/0x62
> >> [<ffffffff81095f81>] copy_process+0x7e0/0x16e8
> >> [<ffffffff8109702d>] _do_fork+0xbb/0x2d3
> >> [<ffffffff810465e8>] ? __do_page_fault+0x2e1/0x384
> >> [<ffffffff8112633f>] ? trace_hardirqs_off_caller+0x12/0x24
> >> [<ffffffff810972cb>] SyS_clone+0x19/0x1b
> >> [<ffffffff81003800>] do_syscall_64+0x143/0x173
> >> [<ffffffff81507289>] entry_SYSCALL64_slow_path+0x25/0x25
> >>
> >> Use raw_cpu_ptr() instead of this_cpu_ptr() to hide this warning.
> >> It's fine because llist_add() implementation is lock-less, so it works even
> >> if we adding to the list of some other cpu. schedule_work() is also preempt-safe.
> >>
> >> Reported-by: kernel test robot <[email protected]>
> >> Signed-off-by: Andrey Ryabinin <[email protected]>
> >
> > Acked-by: Michal Hocko <[email protected]>
>
> But not quite acked by me. What happened to the vfree code that
> causes vfree_deferred to be called in a preemptable context? That
> sounds like a bug.

Not sure I understand but the above stack points to a preemptible
context (copy_process). My stack was different and it looks preemptible as well.
free_thread_stack calls vfree_atomic unconditionally. So I am not sure
why do you think this is a bug?

> (This code doesn't exist in Linus' tree. What tree does this apply to.)

Anyway, now that I am looking at Andrew's tree I can see [1] which
doesn't have this_cpu_ptr. So I am not sure where this this_cpu_ptr came
from. Maybe the previous version of the patch which has shown up in the
linux-next and Andrew has picked up [2] in the meantime. /me confused

[1] http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-add-vfree_atomic.patch
[2] http://lkml.kernel.org/r/[email protected]

> >
> >> ---
> >> mm/vmalloc.c | 9 ++++++++-
> >> 1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> >> index 43f0608..d8813963 100644
> >> --- a/mm/vmalloc.c
> >> +++ b/mm/vmalloc.c
> >> @@ -1498,7 +1498,14 @@ static void __vunmap(const void *addr, int deallocate_pages)
> >>
> >> static inline void __vfree_deferred(const void *addr)
> >> {
> >> - struct vfree_deferred *p = this_cpu_ptr(&vfree_deferred);
> >> + /*
> >> + * Use raw_cpu_ptr() because this can be called from preemptible
> >> + * context. Preemption is absolutely fine here, because llist_add()
> >> + * implementation is lockless, so it works even if we adding to list
> >> + * of the other cpu.
> >> + * schedule_work() should be fine with this too.
> >> + */
> >> + struct vfree_deferred *p = raw_cpu_ptr(&vfree_deferred);
> >>
> >> if (llist_add((struct llist_node *)addr, &p->list))
> >> schedule_work(&p->wq);
> >> --
> >> 2.7.3
> >
> > --
> > Michal Hocko
> > SUSE Labs
>
>
>
> --
> Andy Lutomirski
> AMA Capital Management, LLC

--
Michal Hocko
SUSE Labs

2016-12-13 18:16:15

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] mm-add-vfree_atomic-fix

On Tue, Dec 13, 2016 at 9:24 AM, Michal Hocko <[email protected]> wrote:
> On Tue 13-12-16 08:57:34, Andy Lutomirski wrote:
>> On Tue, Dec 13, 2016 at 2:12 AM, Michal Hocko <[email protected]> wrote:
>> > [CC Andy]
>> >
>> > I've noticed the same
>> > http://lkml.kernel.org/r/[email protected]
>> > and also concluded same as you
>> >
>> > On Mon 12-12-16 17:46:21, Andrey Ryabinin wrote:
>> >> DEBUG_PREEMPT complains about using this_cpu_ptr() in preemptible:
>> >> BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277
>> >> caller is debug_smp_processor_id+0x17/0x19
>> >> CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2
>> >> ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da
>> >> ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff
>> >> ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a
>> >> Call Trace:
>> >> [<ffffffff8123ae6f>] dump_stack+0x9a/0xd0
>> >> [<ffffffff81252f41>] check_preemption_disabled+0xdd/0xef
>> >> [<ffffffff81252f6a>] debug_smp_processor_id+0x17/0x19
>> >> [<ffffffff811796df>] __vfree_deferred+0x16/0x4c
>> >> [<ffffffff8117b584>] vfree_atomic+0x22/0x24
>> >> [<ffffffff81094f5d>] free_thread_stack+0xc2/0x106
>> >> [<ffffffff810951be>] put_task_stack+0x4c/0x62
>> >> [<ffffffff81095f81>] copy_process+0x7e0/0x16e8
>> >> [<ffffffff8109702d>] _do_fork+0xbb/0x2d3
>> >> [<ffffffff810465e8>] ? __do_page_fault+0x2e1/0x384
>> >> [<ffffffff8112633f>] ? trace_hardirqs_off_caller+0x12/0x24
>> >> [<ffffffff810972cb>] SyS_clone+0x19/0x1b
>> >> [<ffffffff81003800>] do_syscall_64+0x143/0x173
>> >> [<ffffffff81507289>] entry_SYSCALL64_slow_path+0x25/0x25
>> >>
>> >> Use raw_cpu_ptr() instead of this_cpu_ptr() to hide this warning.
>> >> It's fine because llist_add() implementation is lock-less, so it works even
>> >> if we adding to the list of some other cpu. schedule_work() is also preempt-safe.
>> >>
>> >> Reported-by: kernel test robot <[email protected]>
>> >> Signed-off-by: Andrey Ryabinin <[email protected]>
>> >
>> > Acked-by: Michal Hocko <[email protected]>
>>
>> But not quite acked by me. What happened to the vfree code that
>> causes vfree_deferred to be called in a preemptable context? That
>> sounds like a bug.
>
> Not sure I understand but the above stack points to a preemptible
> context (copy_process). My stack was different and it looks preemptible as well.
> free_thread_stack calls vfree_atomic unconditionally. So I am not sure
> why do you think this is a bug?
>
>> (This code doesn't exist in Linus' tree. What tree does this apply to.)
>
> Anyway, now that I am looking at Andrew's tree I can see [1] which
> doesn't have this_cpu_ptr. So I am not sure where this this_cpu_ptr came
> from. Maybe the previous version of the patch which has shown up in the
> linux-next and Andrew has picked up [2] in the meantime. /me confused
>
> [1] http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-add-vfree_atomic.patch
> [2] http://lkml.kernel.org/r/[email protected]

The underlying issue seems to be that we have this shiny new function
vfree_atomic() which doesn't work in *non-atomic* context and that we
have "kernel/fork: use vfree_atomic() to free thread stack" that calls
vfree_atomic() from non-atomic context.

I'm not sure what the motivation of the latter patch was, but ISTM we
should revert it. TBH I'm not quite sure what the purpose of
splitting vfree() and vfree_atomic() was, but I'm not seeing any
reason that the common case of freeing stacks from non-atomic context
should defer the free instead of just doing it right away.

Andrey, Johannes, why should task stack freeing use vfree_atomic() in
the first place?

2016-12-13 19:12:50

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH] mm-add-vfree_atomic-fix



On 12/13/2016 08:24 PM, Michal Hocko wrote:
> On Tue 13-12-16 08:57:34, Andy Lutomirski wrote:
>> On Tue, Dec 13, 2016 at 2:12 AM, Michal Hocko <[email protected]> wrote:
>>> [CC Andy]
>>>
>>> I've noticed the same
>>> http://lkml.kernel.org/r/[email protected]
>>> and also concluded same as you
>>>
>>> On Mon 12-12-16 17:46:21, Andrey Ryabinin wrote:
>>>> DEBUG_PREEMPT complains about using this_cpu_ptr() in preemptible:
>>>> BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277
>>>> caller is debug_smp_processor_id+0x17/0x19
>>>> CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2
>>>> ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da
>>>> ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff
>>>> ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a
>>>> Call Trace:
>>>> [<ffffffff8123ae6f>] dump_stack+0x9a/0xd0
>>>> [<ffffffff81252f41>] check_preemption_disabled+0xdd/0xef
>>>> [<ffffffff81252f6a>] debug_smp_processor_id+0x17/0x19
>>>> [<ffffffff811796df>] __vfree_deferred+0x16/0x4c
>>>> [<ffffffff8117b584>] vfree_atomic+0x22/0x24
>>>> [<ffffffff81094f5d>] free_thread_stack+0xc2/0x106
>>>> [<ffffffff810951be>] put_task_stack+0x4c/0x62
>>>> [<ffffffff81095f81>] copy_process+0x7e0/0x16e8
>>>> [<ffffffff8109702d>] _do_fork+0xbb/0x2d3
>>>> [<ffffffff810465e8>] ? __do_page_fault+0x2e1/0x384
>>>> [<ffffffff8112633f>] ? trace_hardirqs_off_caller+0x12/0x24
>>>> [<ffffffff810972cb>] SyS_clone+0x19/0x1b
>>>> [<ffffffff81003800>] do_syscall_64+0x143/0x173
>>>> [<ffffffff81507289>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>
>>>> Use raw_cpu_ptr() instead of this_cpu_ptr() to hide this warning.
>>>> It's fine because llist_add() implementation is lock-less, so it works even
>>>> if we adding to the list of some other cpu. schedule_work() is also preempt-safe.
>>>>
>>>> Reported-by: kernel test robot <[email protected]>
>>>> Signed-off-by: Andrey Ryabinin <[email protected]>
>>>
>>> Acked-by: Michal Hocko <[email protected]>
>>
>> But not quite acked by me. What happened to the vfree code that
>> causes vfree_deferred to be called in a preemptable context? That
>> sounds like a bug.
>
> Not sure I understand but the above stack points to a preemptible
> context (copy_process). My stack was different and it looks preemptible as well.
> free_thread_stack calls vfree_atomic unconditionally. So I am not sure
> why do you think this is a bug?
>
>> (This code doesn't exist in Linus' tree. What tree does this apply to.)
>
> Anyway, now that I am looking at Andrew's tree I can see [1] which
> doesn't have this_cpu_ptr. So I am not sure where this this_cpu_ptr came
> from. Maybe the previous version of the patch which has shown up in the
> linux-next and Andrew has picked up [2] in the meantime. /me confused
>

this_cpu_ptr() comes from the original patch http://lkml.kernel.org/r/[email protected]
Andrew picked up [2] and folded-merged it into original patch and sent it to Linus.
Now it in Linus tree, commit bf22e37a641327e34681b7b6959d9646e3886


> [1] http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-add-vfree_atomic.patch
> [2] http://lkml.kernel.org/r/[email protected]
>

2016-12-13 20:01:32

by Andrey Ryabinin

[permalink] [raw]
Subject: Re: [PATCH] mm-add-vfree_atomic-fix

On 12/13/2016 09:15 PM, Andy Lutomirski wrote:


>>>
>>> But not quite acked by me. What happened to the vfree code that
>>> causes vfree_deferred to be called in a preemptable context? That
>>> sounds like a bug.
>>
>> Not sure I understand but the above stack points to a preemptible
>> context (copy_process). My stack was different and it looks preemptible as well.
>> free_thread_stack calls vfree_atomic unconditionally. So I am not sure
>> why do you think this is a bug?
>>
>>> (This code doesn't exist in Linus' tree. What tree does this apply to.)
>>
>> Anyway, now that I am looking at Andrew's tree I can see [1] which
>> doesn't have this_cpu_ptr. So I am not sure where this this_cpu_ptr came
>> from. Maybe the previous version of the patch which has shown up in the
>> linux-next and Andrew has picked up [2] in the meantime. /me confused
>>
>> [1] http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-add-vfree_atomic.patch
>> [2] http://lkml.kernel.org/r/[email protected]
>
> The underlying issue seems to be that we have this shiny new function
> vfree_atomic() which doesn't work in *non-atomic* context and that we

It does work non-atomic context. It's fixed now.

> have "kernel/fork: use vfree_atomic() to free thread stack" that calls
> vfree_atomic() from non-atomic context.

>From both context actually. Usually task stack is freed from atomic context:
http://lkml.kernel.org/r/[email protected]
http://lkml.kernel.org/r/CALCETrVqjejgpQVUdem8RK3uxdEgfOZy4cOJqJQjCLtBDnJfyQ@mail.gmail.com

On rare occasions it can be freed from non-atomic context, e.g. error path in copy_process().

> I'm not sure what the motivation of the latter patch was, but ISTM we
> should revert it. TBH I'm not quite sure what the purpose of
> splitting vfree() and vfree_atomic() was, but I'm not seeing any
> reason that the common case of freeing stacks from non-atomic context
> should defer the free instead of just doing it right away.
>
> Andrey, Johannes, why should task stack freeing use vfree_atomic() in
> the first place?

Because vfree() now can sleep and task stack freeing usually done in atomic context.


2016-12-14 03:02:46

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] mm-add-vfree_atomic-fix

On Tue, Dec 13, 2016 at 11:21 AM, Andrey Ryabinin
<[email protected]> wrote:
> On 12/13/2016 09:15 PM, Andy Lutomirski wrote:
>
>
>>>>
>>>> But not quite acked by me. What happened to the vfree code that
>>>> causes vfree_deferred to be called in a preemptable context? That
>>>> sounds like a bug.
>>>
>>> Not sure I understand but the above stack points to a preemptible
>>> context (copy_process). My stack was different and it looks preemptible as well.
>>> free_thread_stack calls vfree_atomic unconditionally. So I am not sure
>>> why do you think this is a bug?
>>>
>>>> (This code doesn't exist in Linus' tree. What tree does this apply to.)
>>>
>>> Anyway, now that I am looking at Andrew's tree I can see [1] which
>>> doesn't have this_cpu_ptr. So I am not sure where this this_cpu_ptr came
>>> from. Maybe the previous version of the patch which has shown up in the
>>> linux-next and Andrew has picked up [2] in the meantime. /me confused
>>>
>>> [1] http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-add-vfree_atomic.patch
>>> [2] http://lkml.kernel.org/r/[email protected]
>>
>> The underlying issue seems to be that we have this shiny new function
>> vfree_atomic() which doesn't work in *non-atomic* context and that we
>
> It does work non-atomic context. It's fixed now.
>
>> have "kernel/fork: use vfree_atomic() to free thread stack" that calls
>> vfree_atomic() from non-atomic context.
>
> From both context actually. Usually task stack is freed from atomic context:
> http://lkml.kernel.org/r/[email protected]
> http://lkml.kernel.org/r/CALCETrVqjejgpQVUdem8RK3uxdEgfOZy4cOJqJQjCLtBDnJfyQ@mail.gmail.com
>
> On rare occasions it can be freed from non-atomic context, e.g. error path in copy_process().
>
>> I'm not sure what the motivation of the latter patch was, but ISTM we
>> should revert it. TBH I'm not quite sure what the purpose of
>> splitting vfree() and vfree_atomic() was, but I'm not seeing any
>> reason that the common case of freeing stacks from non-atomic context
>> should defer the free instead of just doing it right away.
>>
>> Andrey, Johannes, why should task stack freeing use vfree_atomic() in
>> the first place?
>
> Because vfree() now can sleep and task stack freeing usually done in atomic context.
>
>

Fair enough.

--
Andy Lutomirski
AMA Capital Management, LLC