Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932623AbdC3WWi (ORCPT ); Thu, 30 Mar 2017 18:22:38 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:39424 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932201AbdC3WWg (ORCPT ); Thu, 30 Mar 2017 18:22:36 -0400 Date: Thu, 30 Mar 2017 15:22:29 -0700 From: Andrew Morton To: Andrey Ryabinin Cc: , , , , , , , , , , , , , , Subject: Re: [PATCH 1/4] mm/vmalloc: allow to call vfree() in atomic context Message-Id: <20170330152229.f2108e718114ed77acae7405@linux-foundation.org> In-Reply-To: <20170330102719.13119-1-aryabinin@virtuozzo.com> References: <20170330102719.13119-1-aryabinin@virtuozzo.com> X-Mailer: Sylpheed 3.4.1 (GTK+ 2.24.23; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2511 Lines: 66 On Thu, 30 Mar 2017 13:27:16 +0300 Andrey Ryabinin wrote: > Commit 5803ed292e63 ("mm: mark all calls into the vmalloc subsystem > as potentially sleeping") added might_sleep() to remove_vm_area() from > vfree(), and commit 763b218ddfaf ("mm: add preempt points into > __purge_vmap_area_lazy()") actually made vfree() potentially sleeping. > > This broke vmwgfx driver which calls vfree() under spin_lock(). > > BUG: sleeping function called from invalid context at mm/vmalloc.c:1480 > in_atomic(): 1, irqs_disabled(): 0, pid: 341, name: plymouthd > 2 locks held by plymouthd/341: > #0: (drm_global_mutex){+.+.+.}, at: [] drm_release+0x3b/0x3b0 [drm] > #1: (&(&tfile->lock)->rlock){+.+...}, at: [] ttm_object_file_release+0x28/0x90 [ttm] > > Call Trace: > dump_stack+0x86/0xc3 > ___might_sleep+0x17d/0x250 > __might_sleep+0x4a/0x80 > remove_vm_area+0x22/0x90 > __vunmap+0x2e/0x110 > vfree+0x42/0x90 > kvfree+0x2c/0x40 > drm_ht_remove+0x1a/0x30 [drm] > ttm_object_file_release+0x50/0x90 [ttm] > vmw_postclose+0x47/0x60 [vmwgfx] > drm_release+0x290/0x3b0 [drm] > __fput+0xf8/0x210 > ____fput+0xe/0x10 > task_work_run+0x85/0xc0 > exit_to_usermode_loop+0xb4/0xc0 > do_syscall_64+0x185/0x1f0 > entry_SYSCALL64_slow_path+0x25/0x25 > > This can be fixed in vmgfx, but it would be better to make vfree() > non-sleeping again because we may have other bugs like this one. I tend to disagree: adding yet another schedule_work() introduces additional overhead and adds some risk of ENOMEM errors which wouldn't occur with a synchronous free. > __purge_vmap_area_lazy() is the only function in the vfree() path that > wants to be able to sleep. So it make sense to schedule > __purge_vmap_area_lazy() via schedule_work() so it runs only in sleepable > context. vfree() already does if (unlikely(in_interrupt())) __vfree_deferred(addr); so it seems silly to introduce another defer-to-kernel-thread thing when we already have one. > This will have a minimal effect on the regular vfree() path. > since __purge_vmap_area_lazy() is rarely called. hum, OK, so perhaps the overhead isn't too bad. Remind me: where does __purge_vmap_area_lazy() sleep? Seems to me that a better fix would be to make vfree() atomic, if poss. Otherwise, to fix callers so they call vfree from sleepable context. That will reduce kernel latencies as well.