Date: Wed, 18 Feb 2015 17:02:23 -0600 (CST)
From: Christoph Lameter
X-X-Sender: cl@gentwo.org
To: Jesper Dangaard Brouer
cc: Joonsoo Kim, David Rientjes, akpm@linuxfoundation.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, penberg@kernel.org,
    iamjoonsoo@lge.com
Subject: Re: [PATCH 1/3] Slab infrastructure for array operations
In-Reply-To: <20150218103245.3aa3ca87@redhat.com>
References: <20150210194804.288708936@linux.com>
 <20150210194811.787556326@linux.com>
 <20150213023534.GA6592@js1304-P5Q-DELUXE>
 <20150217051541.GA15413@js1304-P5Q-DELUXE>
 <20150218103245.3aa3ca87@redhat.com>

On Wed, 18 Feb 2015, Jesper Dangaard Brouer wrote:

> (My use-case is in the area of 32-64 elems)

Ok, that is in the realm of a couple of pages from the page allocator?

> > It's not that detailed. It is just laying out the basic strategy for
> > the array allocs. First go to the partial lists to decrease
> > fragmentation. Then bypass the allocator layers completely and go
> > direct to the page allocator if all objects that the page will
> > accommodate can be put into the array. Lastly use the cpu hot objects
> > to fill in the leftover (which would in any case be less than the
> > objects in a page).
>
> IMHO this strategy is a bit off from what I was looking for.
>
> I would prefer the first elements to be cache hot, and the later/rest
> of the elements can be more cache-cold. The reasoning behind this is
> that the subsystem calling this alloc_array has likely run out of elems
> (from its local store/previous call) and needs to hand out one elem
> immediately after this call returns.

The problem is that going for the cache hot objects involves
synchronization that you would not have to spend time on when going
directly to the page allocator, or to the partial lists where multiple
objects can be retrieved by taking a single lock. Per-cpu object
allocation (cache hot!) is already optimized to the hilt; there won't be
much of a benefit.
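
To make the fill order above concrete, here is a rough sketch in
kernel-style C, assuming a kmem_cache_alloc_array()-style entry point as
proposed in this series. The helpers alloc_from_partial(),
alloc_from_new_pages(), alloc_from_cpu_freelist() and objs_per_slab()
are hypothetical placeholders for illustration, not functions from the
patch set:

	/*
	 * Illustration only -- not code from this patch set.  Sketch of
	 * the fill order described above: partial lists first, then whole
	 * pages straight from the page allocator, then the per-cpu
	 * (cache hot) objects for the small leftover.
	 */
	size_t kmem_cache_alloc_array(struct kmem_cache *s, gfp_t flags,
				      size_t nr, void **p)
	{
		size_t allocated;

		/* 1. Pull objects off the partial lists under a single
		 *    lock; this also reduces fragmentation. */
		allocated = alloc_from_partial(s, flags, nr, p);

		/* 2. If the remainder still fills at least one whole slab
		 *    page, bypass the allocator layers and take pages
		 *    directly from the page allocator. */
		if (nr - allocated >= objs_per_slab(s))
			allocated += alloc_from_new_pages(s, flags,
							  nr - allocated,
							  p + allocated);

		/* 3. Top up the leftover (less than a page's worth of
		 *    objects) from the per-cpu freelist -- the cache hot
		 *    objects. */
		if (allocated < nr)
			allocated += alloc_from_cpu_freelist(s, flags,
							     nr - allocated,
							     p + allocated);

		return allocated;
	}

Jesper's preference would roughly invert steps 1 and 3, so that the
objects handed out first come from the per-cpu freelist, at the cost of
the extra per-cpu synchronization discussed above.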