Date: Thu, 10 Jan 2008 12:12:48 +0100
From: Andi Kleen
To: Ingo Molnar
Cc: Andi Kleen, linux-kernel@vger.kernel.org, Thomas Gleixner,
    "H. Peter Anvin", Venki Pallipadi, suresh.b.siddha@intel.com,
    Arjan van de Ven, Dave Jones
Subject: Re: CPA patchset
Message-ID: <20080110111248.GR25945@bingen.suse.de>
References: <20080103424.989432000@suse.de> <20080110093126.GA360@elte.hu>
    <20080110095337.GK25945@bingen.suse.de> <20080110100443.GB28209@elte.hu>
    <20080110100712.GO25945@bingen.suse.de> <20080110105726.GD28209@elte.hu>
In-Reply-To: <20080110105726.GD28209@elte.hu>

On Thu, Jan 10, 2008 at 11:57:26AM +0100, Ingo Molnar wrote:
>
> > > > > WBINVD isnt particular fast (takes a few msecs), but why is
> > > > > that a problem? Drivers dont do high-frequency ioremap-ing.
> > > > > It's typically only done at driver/device startup and that's
> > > > > it.
> > > >
> > > > Actually graphics drivers can do higher frequency allocation of WC
> > > > memory (with PAT) support.
> > >
> > > but that's not too smart: why dont they use WB plus cflush instead?
> >
> > Because they need to access it WC for performance.
>
> I think you have it fundamentally backwards: the best for performance is
> WB + cflush. What would WC offer for performance that cflush cannot do?

Cached (WB) memory requires the cache line to be read first before you
can write it.

WC on the other hand does not allocate a cache line and just dumps the
data into a special write-combining buffer. It was invented originally
because reads from AGP were incredibly slow.

And it's race-free regarding the caching protocol (assuming you flush
the caches and TLBs correctly).

Another typical problem is that if something is mapped uncached then
you can't have it in any other cache, because if that cache eventually
flushes the line it will corrupt the data. That can happen with
remapping apertures, for example, which remap data behind the CPU's
back.

CLFLUSH is really only a hint, and it cannot be used if UC is needed
for correctness.

> also, it's irrelevant to change_page_attr() call frequency. Just map in
> everything from the card and use it. In graphics, if you remap anything
> on the fly and it's not a slowpath you've lost the performance game even
> before you began it.

The typical case would be lots of user space DRI clients supplying
their own buffers on the fly. There's not really a fixed pool in this
case; it all varies dynamically. In some scenarios that could happen
quite often.

-Andi