Date: Tue, 14 Jul 2015 15:45:40 -0700 (PDT)
From: David Rientjes
To: Dave Chinner
cc: Mike Snitzer, Mikulas Patocka, Edward Thornber, linux-kernel@vger.kernel.org, linux-mm@kvack.org, dm-devel@redhat.com, Vivek Goyal, Andrew Morton, Linus Torvalds, "Alasdair G. Kergon"
Subject: Re: [PATCH 2/7] mm: introduce kvmalloc and kvmalloc_node
In-Reply-To: <20150714215413.GP3902@dastard>
References: <20150707144117.5b38ac38efda238af8a1f536@linux-foundation.org> <20150708161815.bdff609d77868dbdc2e1ce64@linux-foundation.org> <20150714211918.GC7915@redhat.com> <20150714215413.GP3902@dastard>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 15 Jul 2015, Dave Chinner wrote:

> > Sure, but it's not accomplishing the same thing: things like
> > ext4_kvmalloc() only want to fall back to vmalloc() when high-order
> > allocations fail: the function is used for different sizes.  This cannot
> > be converted to kvmalloc_node() since it falls back immediately when
> > reclaim fails.  Same issue with single_file_open() for the seq_file code.
> > We could go through every kmalloc() -> vmalloc() fallback for more
> > examples in the code, but those two instances were the first I looked at
> > and couldn't be converted to kvmalloc_node() without work.
> >
> > > It is always easier to shoehorn utility functions locally within a
> > > subsystem (be it ext4, dm, etc) but once enough do something in a
> > > similar but different way it really should get elevated.
> > >
> >
> > I would argue that
> >
> > void *ext4_kvmalloc(size_t size, gfp_t flags)
> > {
> > 	void *ret;
> >
> > 	ret = kmalloc(size, flags | __GFP_NOWARN);
> > 	if (!ret)
> > 		ret = __vmalloc(size, flags, PAGE_KERNEL);
> > 	return ret;
> > }
> >
> > is simple enough that we don't need to convert it to anything.
>
> Except that it will have problems with GFP_NOFS context when the pte
> code inside vmalloc does a GFP_KERNEL allocation. Hence we have
> stuff in other subsystems (such as XFS) where we've noticed lockdep
> whining about this:
>

Does anyone have an example of ext4_kvmalloc() having a lockdep violation?
Presumably the GFP_NOFS calls to ext4_kvmalloc() will never have
size > (1 << (PAGE_SHIFT + PAGE_ALLOC_COSTLY_ORDER)) so that kmalloc()
above actually never returns NULL and __vmalloc() only gets used for the
ext4_kvmalloc(..., GFP_KERNEL) call.

It should be fixed, though, probably in the same way as
kmem_zalloc_large() today, but it seems the real fix would be to attack
the whole vmalloc() GFP_KERNEL issue that has been talked about several
times in the past.  Then the existing ext4_kvmalloc() implementation
should be fine.

Once that's done, we can revisit the idea of a generalized kvmalloc() or
kvmalloc_node(), but since the implementation such as above is different
from the proposed kvmalloc_node() implementation with respect to
high-order allocations, I doubt a generalized form will be helpful.
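
To put a number on the size bound mentioned above, here is a minimal userspace sketch of the arithmetic, not kernel code. It assumes 4 KiB pages (PAGE_SHIFT of 12, as on x86); PAGE_ALLOC_COSTLY_ORDER is 3 in include/linux/mmzone.h. The helper names costly_threshold() and may_reach_vmalloc() are made up for illustration, and the premise that non-costly kmalloc() requests effectively never return NULL is the presumption stated above, not something the sketch verifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed for illustration: 4 KiB pages, as on x86. */
#define PAGE_SHIFT 12
/* Same value as the kernel's definition in include/linux/mmzone.h. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Under these assumptions the bound is 8 pages, i.e. 32 KiB. */
static_assert(((size_t)1 << (PAGE_SHIFT + PAGE_ALLOC_COSTLY_ORDER)) == 32768,
	      "threshold is 32 KiB with 4 KiB pages");

/* Hypothetical helper: largest size that still fits a non-costly order. */
static size_t costly_threshold(void)
{
	return (size_t)1 << (PAGE_SHIFT + PAGE_ALLOC_COSTLY_ORDER);
}

/*
 * Hypothetical helper: true when `size` needs an order above
 * PAGE_ALLOC_COSTLY_ORDER, which is the only case where the kmalloc()
 * in ext4_kvmalloc() is presumed able to fail and reach __vmalloc().
 */
static bool may_reach_vmalloc(size_t size)
{
	return size > costly_threshold();
}
```

With these values, a GFP_NOFS caller asking for 32768 bytes or less would stay on the kmalloc() path under that presumption, and only a request of 32769 bytes or more could reach the __vmalloc() fallback.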