Date: Thu, 12 Feb 2015 10:43:16 +1300
From: Jesper Dangaard Brouer
To: Christoph Lameter
Cc: akpm@linuxfoundation.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, penberg@kernel.org, iamjoonsoo@lge.com,
 brouer@redhat.com
Subject: Re: [PATCH 2/3] slub: Support for array operations
Message-ID: <20150212104316.2d5c32ea@redhat.com>
References: <20150210194804.288708936@linux.com>
 <20150210194811.902155759@linux.com>
 <20150211174817.44cc5562@redhat.com>

On Wed, 11 Feb 2015 13:07:24 -0600 (CST) Christoph Lameter wrote:

> On Wed, 11 Feb 2015, Jesper Dangaard Brouer wrote:
>
> > > +
> > > +	spin_lock_irqsave(&n->list_lock, flags);
> >
> > This is quite an expensive lock with irqsave.
>
> Yes but we take it for all partial pages.

Sure, that is good, but this might become a contention point. In a micro
benchmark the contention should be visible, but in real use-cases the
subsystem in question also needs to spend time using these elements
before requesting a new batch (as long as NIC cleanup cycles don't get
too synchronized).

> > Yet another lock cost.
>
> Yup, the page access is shared but there is one per page. Contention is
> unlikely.

Yes, contention is unlikely, but every atomic operation is expensive.
On my system the measured cost is 8 ns per atomic operation, and a
lock/unlock pair does two, thus 16 ns, which we then pay per page
freelist.
> > > +	spin_unlock_irqrestore(&n->list_lock, flags);
> > > +	return allocated;
> >
> > I estimate (on my CPU) the locking cost itself is more than 32 ns, plus
> > the irqsave (which I've also found quite expensive, alone 14 ns). Thus,
> > an estimated 46 ns. The single-element slub fast path costs 18-19 ns, so
> > bulking 3-4 elements should be enough to amortize the cost; guess we
> > are still good :-)
>
> We can require that interrupt are off when the functions are called. Then
> we can avoid the "save" part?

Yes, we could also do so with an "_irqoff" variant of the function call,
but given we are defining the API we can just require this from the
start. I plan to use this in softirq context, where I know interrupts
are on, and where I can thus use the less expensive "non-save" variants
local_irq_{disable,enable}.

Measurements show (x86_64 E5-2695):
 * 2.860 ns cost for local_irq_{disable,enable}
 * 14.840 ns cost for local_irq_save() + local_irq_restore()

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer