Date: Thu, 13 Jun 2013 12:23:39 -0700
From: Andrew Morton
To: Tejun Heo
Cc: Kent Overstreet, linux-kernel@vger.kernel.org, Oleg Nesterov, Christoph Lameter, Ingo Molnar, Andi Kleen, Jens Axboe, "Nicholas A. Bellinger", Jeff Layton, "J. Bruce Fields"
Subject: Re: [PATCH] Percpu tag allocator
Message-Id: <20130613122339.239a721d097a64435817a780@linux-foundation.org>
In-Reply-To: <20130613191507.GB13970@mtj.dyndns.org>

On Thu, 13 Jun 2013 12:15:07 -0700 Tejun Heo wrote:

> Hello, Andrew.
>
> On Thu, Jun 13, 2013 at 12:04:39PM -0700, Andrew Morton wrote:
> > > The thing is that id[r|a] guarantee that the lowest available slot is
> > > allocated
> >
> > That isn't the case for ida_get_new_above() - the caller gets to
> > control the starting index.
>
> Hmmm?  get_new_above() is the same, it must allocate the first
> available ID above the given low bound - used to exclude unused or
> reserved IDs.

Right.
So using different starting IDs for different CPUs can be used to
improve scalability.

> > The worst outcome here is that idr.c remains unimproved and we merge a
> > new allocator which does basically the same thing.
>
> The lowest number guarantee makes them different.  Maybe tag
> allocation can be layered on top as a caching layer, I don't know, but
> at any rate we need at least two different operation modes.

Why?  Tag allocation doesn't care about the values - just that they be
unique.

> > The best outcome is that idr.c gets improved and we don't have to merge
> > duplicative code.
> >
> > So please, let's put aside the shiny new thing for now and work out how
> > we can use the existing tag allocator for these applications.  If we
> > make a genuine effort to do this and decide that it's fundamentally
> > hopeless then this is the time to start looking at new implementations.
> >
> > (I can think of at least two ways of making ida_get_new_above() an
> > order of magnitude faster for this application and I'm sure you guys
> > can as well.)
>
> Oh, I'm sure the current id[r|a] can be improved upon a lot but I'm
> very skeptical one can reach the level of scalability necessary for,
> say, pci-e attached extremely high-iops devices while still keeping
> the lowest number allocation, which can't be achieved without strong
> synchronization on each alloc/free.
>
> Maybe we can layer things so that we have percpu layer on top of
> id[r|a] and, say, mapping id to point is still done by idr, or the
> percpu tag allocator uses ida for tag chunk allocations, but it's
> still gonna be something extra on top.

It's not obvious that explicit per-cpu is needed.  Get an ID from
ida_get_new_above(), multiply it by 16 and store that in device-local
storage, along with a 16-bit bitmap.  Blam, 30 lines of code and the
ida_get_new_above() cost is reduced 16x and it's off the map.
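
A rough userspace sketch of that chunking idea, under stated
assumptions: chunk_id_alloc() is a hypothetical stand-in for
ida_get_new_above() (here just a counter), struct tag_cache and the
MAX_CHUNKS cap are invented for illustration, and there is no locking
or error handling.  It is only meant to show the ID * 16 plus 16-bit
bitmap arithmetic, not a real implementation.

```c
#include <assert.h>
#include <stdint.h>

#define TAGS_PER_CHUNK	16
#define MAX_CHUNKS	4	/* arbitrary cap, chosen only for this sketch */

/*
 * Hypothetical stand-in for ida_get_new_above(): returns a fresh, unique
 * chunk ID on every call.  Kernel code would call the IDA and check errors.
 */
static int next_chunk_id;

static int chunk_id_alloc(void)
{
	return next_chunk_id++;
}

struct tag_chunk {
	int base;	/* chunk ID * TAGS_PER_CHUNK, or -1 if slot unused */
	uint16_t used;	/* bit i set => tag (base + i) is allocated */
};

/* Device-local storage (assumed layout).  No locking in this sketch. */
struct tag_cache {
	struct tag_chunk chunk[MAX_CHUNKS];
};

static void tag_cache_init(struct tag_cache *tc)
{
	for (int i = 0; i < MAX_CHUNKS; i++) {
		tc->chunk[i].base = -1;
		tc->chunk[i].used = 0;
	}
}

/* Returns a unique tag, or -1 if every chunk slot is full. */
static int tag_alloc(struct tag_cache *tc)
{
	for (int i = 0; i < MAX_CHUNKS; i++) {
		struct tag_chunk *c = &tc->chunk[i];

		/* Only this path touches the shared ID allocator. */
		if (c->base < 0) {
			c->base = chunk_id_alloc() * TAGS_PER_CHUNK;
			c->used = 0;
		}
		if (c->used == 0xffff)
			continue;

		for (int bit = 0; bit < TAGS_PER_CHUNK; bit++) {
			if (!(c->used & (1u << bit))) {
				c->used |= (uint16_t)(1u << bit);
				return c->base + bit;
			}
		}
	}
	return -1;
}

/* Clears the tag's bit; the chunk itself stays cached for reuse. */
static void tag_free(struct tag_cache *tc, int tag)
{
	int base = tag - tag % TAGS_PER_CHUNK;

	for (int i = 0; i < MAX_CHUNKS; i++) {
		struct tag_chunk *c = &tc->chunk[i];

		if (c->base == base) {
			assert(c->used & (1u << (tag % TAGS_PER_CHUNK)));
			c->used &= (uint16_t)~(1u << (tag % TAGS_PER_CHUNK));
			return;
		}
	}
	assert(!"tag does not belong to this cache");
}
```

In this sketch only one tag_alloc() call in sixteen reaches
chunk_id_alloc(); the rest are satisfied from the local bitmap.  Tags
are unique but not lowest-available across devices, which is all tag
allocation needs.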

Or perhaps you can think of something smarter, but first you have to
start thinking of solutions rather than trying to find problems :(