Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964810AbcLMSQP (ORCPT ); Tue, 13 Dec 2016 13:16:15 -0500 Received: from mail-vk0-f52.google.com ([209.85.213.52]:35244 "EHLO mail-vk0-f52.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934707AbcLMSQF (ORCPT ); Tue, 13 Dec 2016 13:16:05 -0500 MIME-Version: 1.0 In-Reply-To: <20161213172441.GA22610@dhcp22.suse.cz> References: <87lgvlzp34.fsf@yhuang-dev.intel.com> <1481553981-3856-1-git-send-email-aryabinin@virtuozzo.com> <20161213101254.GC10498@dhcp22.suse.cz> <20161213172441.GA22610@dhcp22.suse.cz> From: Andy Lutomirski Date: Tue, 13 Dec 2016 10:15:35 -0800 Message-ID: Subject: Re: [PATCH] mm-add-vfree_atomic-fix To: Michal Hocko , Andrey Ryabinin , Johannes Weiner Cc: Andrew Morton , Huang Ying , Stephen Rothwell , Christoph Hellwig , Joel Fernandes , Jisheng Zhang , Chris Wilson , John Dias , Thomas Gleixner , "H. Peter Anvin" , Ingo Molnar , "linux-kernel@vger.kernel.org" , LKP Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3915 Lines: 74 On Tue, Dec 13, 2016 at 9:24 AM, Michal Hocko wrote: > On Tue 13-12-16 08:57:34, Andy Lutomirski wrote: >> On Tue, Dec 13, 2016 at 2:12 AM, Michal Hocko wrote: >> > [CC Andy] >> > >> > I've noticed the same >> > http://lkml.kernel.org/r/20161209142820.GA4334@dhcp22.suse.cz >> > and also concluded same as you >> > >> > On Mon 12-12-16 17:46:21, Andrey Ryabinin wrote: >> >> DEBUG_PREEMPT complains about using this_cpu_ptr() in preemptible: >> >> BUG: using smp_processor_id() in preemptible [00000000] code: iperf-300s-cs-l/277 >> >> caller is debug_smp_processor_id+0x17/0x19 >> >> CPU: 1 PID: 277 Comm: iperf-300s-cs-l Not tainted 4.9.0-rc8-00140-gcc639db #2 >> >> ffffc900003f3cf0 ffffffff8123ae6f 0000000000000001 ffffffff818181da >> >> ffffc900003f3d20 ffffffff81252f41 0000000000012de0 00000000fffffdff >> >> ffff880009328f40 ffff88000592c400 ffffc900003f3d30 ffffffff81252f6a >> >> Call Trace: >> >> [] dump_stack+0x9a/0xd0 >> >> [] check_preemption_disabled+0xdd/0xef >> >> [] debug_smp_processor_id+0x17/0x19 >> >> [] __vfree_deferred+0x16/0x4c >> >> [] vfree_atomic+0x22/0x24 >> >> [] free_thread_stack+0xc2/0x106 >> >> [] put_task_stack+0x4c/0x62 >> >> [] copy_process+0x7e0/0x16e8 >> >> [] _do_fork+0xbb/0x2d3 >> >> [] ? __do_page_fault+0x2e1/0x384 >> >> [] ? trace_hardirqs_off_caller+0x12/0x24 >> >> [] SyS_clone+0x19/0x1b >> >> [] do_syscall_64+0x143/0x173 >> >> [] entry_SYSCALL64_slow_path+0x25/0x25 >> >> >> >> Use raw_cpu_ptr() instead of this_cpu_ptr() to hide this warning. >> >> It's fine because llist_add() implementation is lock-less, so it works even >> >> if we adding to the list of some other cpu. schedule_work() is also preempt-safe. >> >> >> >> Reported-by: kernel test robot >> >> Signed-off-by: Andrey Ryabinin >> > >> > Acked-by: Michal Hocko >> >> But not quite acked by me. What happened to the vfree code that >> causes vfree_deferred to be called in a preemptable context? That >> sounds like a bug. > > Not sure I understand but the above stack points to a preemptible > context (copy_process). My stack was different and it looks preemptible as well. > free_thread_stack calls vfree_atomic unconditionally. So I am not sure > why do you think this is a bug? > >> (This code doesn't exist in Linus' tree. What tree does this apply to.) > > Anyway, now that I am looking at Andrew's tree I can see [1] which > doesn't have this_cpu_ptr. So I am not sure where this this_cpu_ptr came > from. Maybe the previous version of the patch which has shown up in the > linux-next and Andrew has picked up [2] in the meantime. /me confused > > [1] http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-add-vfree_atomic.patch > [2] http://lkml.kernel.org/r/1481553981-3856-1-git-send-email-aryabinin@virtuozzo.com The underlying issue seems to be that we have this shiny new function vfree_atomic() which doesn't work in *non-atomic* context and that we have "kernel/fork: use vfree_atomic() to free thread stack" that calls vfree_atomic() from non-atomic context. I'm not sure what the motivation of the latter patch was, but ISTM we should revert it. TBH I'm not quite sure what the purpose of splitting vfree() and vfree_atomic() was, but I'm not seeing any reason that the common case of freeing stacks from non-atomic context should defer the free instead of just doing it right away. Andrey, Johannes, why should task stack freeing use vfree_atomic() in the first place?