From: Joel Fernandes
Date: Tue, 8 Nov 2016 05:24:30 -0800
Subject: Re: [PATCH 2/6] mm: mark all calls into the vmalloc subsystem as potentially sleeping
To: Chris Wilson
Cc: Christoph Hellwig, Andrew Morton, Jisheng Zhang, John Dias,
    "open list:MEMORY MANAGEMENT", linux-rt-users@vger.kernel.org, LKML

On Wed, Oct 19, 2016 at 4:15 AM, Chris Wilson wrote:
> On Tue, Oct 18, 2016 at 08:56:07AM +0200, Christoph Hellwig wrote:
>> This is how everyone seems to already use them, but let's make that
>> explicit.
>
> Ah, found an exception, vmapped stacks:
>
> [ 696.928541] BUG: sleeping function called from invalid context at mm/vmalloc.c:615
> [ 696.928576] in_atomic(): 1, irqs_disabled(): 0, pid: 30521, name: bash
> [ 696.928590] 1 lock held by bash/30521:
> [ 696.928600]  #0:  (vmap_area_lock){+.+...}, at: [] __purge_vmap_area_lazy+0x30f/0x370
> [ 696.928656] CPU: 0 PID: 30521 Comm: bash Tainted: G        W       4.9.0-rc1+ #124
> [ 696.928672] Hardware name:  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
> [ 696.928690]  ffffc900070f7c70 ffffffff812be1f5 ffff8802750b6680 ffffffff819650a6
> [ 696.928717]  ffffc900070f7c98 ffffffff810a3216 0000000000004001 ffff8802726e16c0
> [ 696.928743]  ffff8802726e19a0 ffffc900070f7d08 ffffffff8115f0f3 ffff8802750b6680
> [ 696.928768] Call Trace:
> [ 696.928782]  [] dump_stack+0x68/0x93
> [ 696.928796]  [] ___might_sleep+0x166/0x220
> [ 696.928809]  [] __purge_vmap_area_lazy+0x333/0x370
> [ 696.928823]  [] ? vunmap_page_range+0x1e8/0x350
> [ 696.928837]  [] free_vmap_area_noflush+0x83/0x90
> [ 696.928850]  [] remove_vm_area+0x71/0xb0
> [ 696.928863]  [] __vunmap+0x29/0xf0
> [ 696.928875]  [] vfree+0x29/0x70
> [ 696.928888]  [] put_task_stack+0x76/0x120

From this backtrace, it looks like the lock causing the atomic context
was actually acquired in the vfree() path itself, not by the
vmapped-stack user (the splat shows vmap_area_lock held at
__purge_vmap_area_lazy).

I am still wondering why vmap_area_lock was held across the
might_sleep(); perhaps you have not applied all of Christoph's patches?
From the patches I saw, vmap_area_lock is not held around any of the
might_sleep() calls Christoph added, but I may be missing something.

In any case, it looks to me like the atomicity is introduced by the
vfree() path itself and not by the caller.
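To make that concrete, here is a rough userspace sketch (not the real
mm/vmalloc.c code; every fake_* name below is made up for illustration).
In the sketch, taking the lock bumps an "atomic" counter the way
spin_lock() raises preempt_count on a preemptible kernel, so the
might_sleep()-style check fires from inside the purge path itself,
regardless of what the caller held:

/*
 * Rough userspace analogy, not the real mm/vmalloc.c code: every
 * fake_* name below is made up for illustration.  Taking the lock
 * bumps an "atomic" counter the way spin_lock() raises preempt_count
 * on a preemptible kernel, so the might_sleep()-style check fires
 * from inside the purge path, whatever the caller was holding.
 */
#include <pthread.h>
#include <stdio.h>

static int fake_preempt_count;          /* stands in for preempt_count() */
static pthread_mutex_t fake_vmap_area_lock = PTHREAD_MUTEX_INITIALIZER;

static void fake_spin_lock(pthread_mutex_t *lock)
{
        pthread_mutex_lock(lock);
        fake_preempt_count++;           /* spinlock => "atomic" context */
}

static void fake_spin_unlock(pthread_mutex_t *lock)
{
        fake_preempt_count--;
        pthread_mutex_unlock(lock);
}

static void fake_might_sleep(const char *where)
{
        if (fake_preempt_count)
                fprintf(stderr, "BUG: sleeping function called from invalid context at %s\n",
                        where);
}

/* ~ __purge_vmap_area_lazy(): takes its own lock, then may "sleep" */
static void fake_purge_lazy(void)
{
        fake_spin_lock(&fake_vmap_area_lock);
        fake_might_sleep("fake_purge_lazy");    /* ~ cond_resched_lock(&vmap_area_lock) */
        fake_spin_unlock(&fake_vmap_area_lock);
}

int main(void)
{
        /* ~ vfree() -> free_vmap_area_noflush(): the caller takes no lock here */
        fake_purge_lazy();
        return 0;
}

This obviously glosses over cond_resched_lock() accounting for the one
lock it expects to hold; it is only meant to show where the lock
reported in the splat is taken.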
Thanks!

Joel

> [ 696.928901]  [] finish_task_switch+0x163/0x1e0
> [ 696.928914]  [] ? finish_task_switch+0x65/0x1e0
> [ 696.928928]  [] __schedule+0x1f5/0x7c0
> [ 696.928940]  [] schedule+0x38/0x90
> [ 696.928953]  [] do_wait+0x1d1/0x200
> [ 696.928966]  [] SyS_wait4+0x61/0xc0
> [ 696.928979]  [] ? task_stopped_code+0x50/0x50
> [ 696.928992]  [] entry_SYSCALL_64_fastpath+0x1c/0xb1
>
> [This was triggered by earlier patch to remove the serialisation and add
> cond_resched_lock(&vmap_area_lock)]
> -Chris
>
> --
> Chris Wilson, Intel Open Source Technology Centre