Subject: Re: [PATCH -v3] use per cpu data for single cpu ipi calls
From: Peter Zijlstra
To: Linus Torvalds
Cc: Jens Axboe, Steven Rostedt, Andrew Morton, LKML, Rusty Russell, npiggin@suse.de, Ingo Molnar, Thomas Gleixner, Arjan van de Ven
Date: Fri, 30 Jan 2009 17:16:10 +0100
Message-Id: <1233332170.4495.200.camel@laptop>

On Fri, 2009-01-30 at 08:04 -0800, Linus Torvalds wrote:

> My only question is whether we might even drop the kmalloc() some day:
> I suspect that the CSD_FLAG_LOCK is essentially never a contention point,
> and the cost (and occasional synchronization) of kmalloc() quite possibly
> overwhelms any theoretical scaling ability.

IIRC the recent SL*B numbers posted showed that a kmalloc could be as
cheap as ~100 cycles or something. IPIs are sadly still a bit more
expensive.

> If another CPU hasn't even received its IPI before the same CPU sends the
> next one, I'm not sure we _want_ to send one, in fact.

I think the intent was to re-route IO-completion interrupts to whatever
cpu/node issued the IO, with the idea that that cpu/node has the page
hottest etc. and transferring the completion is cheaper than bouncing the
page.

Since that would be relaying hardware interrupts, there's nothing much you
can do about the rate, or something, that's up to the firmware on $$$ scsi
thing.

But Jens already said that that path was using the __ variant and
providing its own csds, the kmalloc isn't needed there, so it might all be
moot.

> But that's a secondary issue, and isn't a correctness thing, just a "do we
> really need three different allocations?" musing..

Nick, Jens, I was under the presumption that the kmalloc was needed for
something other than failing to deadlock, happen to remember what?
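
For reference, the kmalloc-free variant Linus muses about could look roughly
like the user-space sketch below (C11 atomics, not kernel code). It assumes
one csd slot per cpu guarded by CSD_FLAG_LOCK, and that a sender whose slot
is still in flight simply spins instead of allocating. Every name other than
CSD_FLAG_LOCK is hypothetical and does not match kernel/smp.c.

```c
/* User-space model only; names other than CSD_FLAG_LOCK are made up. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4        /* hypothetical cpu count */
#define CSD_FLAG_LOCK  0x1u

struct csd_sketch {
    void (*func)(void *info);   /* callback to run on the target cpu */
    void *info;                 /* argument handed to the callback */
    atomic_uint flags;          /* CSD_FLAG_LOCK set while in flight */
};

/* One statically allocated slot per sending cpu, so no kmalloc(). */
static struct csd_sketch percpu_csd[NR_CPUS_SKETCH];

/* Try to claim this cpu's slot; returns NULL if the previous call is
 * still pending on the receiver. */
static struct csd_sketch *csd_trylock(int this_cpu)
{
    struct csd_sketch *csd = &percpu_csd[this_cpu];
    unsigned int old = atomic_fetch_or_explicit(&csd->flags, CSD_FLAG_LOCK,
                                                memory_order_acquire);
    return (old & CSD_FLAG_LOCK) ? NULL : csd;
}

/* Sender side: spin until the slot is free, then fill it in. A real
 * implementation would queue the csd and raise the IPI at this point. */
static struct csd_sketch *queue_call(int this_cpu, void (*func)(void *),
                                     void *info)
{
    struct csd_sketch *csd;

    while ((csd = csd_trylock(this_cpu)) == NULL)
        ;                       /* previous call still in flight */
    csd->func = func;
    csd->info = info;
    return csd;
}

/* Receiver side: what the IPI handler would do. Runs the callback and
 * then releases the slot so the sender can reuse it. */
static void csd_run(struct csd_sketch *csd)
{
    unsigned int old;

    csd->func(csd->info);
    old = atomic_fetch_and_explicit(&csd->flags, ~CSD_FLAG_LOCK,
                                    memory_order_release);
    assert(old & CSD_FLAG_LOCK);    /* slot must have been locked */
    (void)old;                      /* quiet NDEBUG builds */
}

/* Example callback: bump the int that info points to. */
static void count_call(void *info)
{
    ++*(int *)info;
}
```

In this model the only cost on the send path is the atomic on the sender's
own slot, and a second send from the same cpu waits until the first has been
consumed. Whether that wait is acceptable for relayed hardware interrupts is
exactly the open question above.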