Date: Thu, 3 Feb 2005 22:26:07 -0800 (PST)
From: Christoph Lameter
To: Paul Mackerras
cc: Rik van Riel, Marcelo Tosatti, David Woodhouse, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, akpm@osdl.org
Subject: Re: A scrub daemon (prezeroing)

On Fri, 4 Feb 2005, Paul Mackerras wrote:

> The dcbz instruction on the G5 (PPC970) establishes the new cache line
> in the L2 cache and doesn't disturb the L1 cache (except to invalidate
> the line in the L1 data cache if it is present there). The L2 cache
> is 512kB and 8-way set associative (LRU). So zeroing a page is
> unlikely to disturb the cache lines that the page fault handler is
> using. Then, when the page fault handler returns to the user program,
> any cache lines that the program wants to touch are available in 12
> cycles (L2 hit latency) instead of 200 - 300 (memory access latency).

If the program does not use these cache lines then you have wasted time in
the page fault handler allocating and handling them. That is what prezeroing
does for you.

> > cpu caches) is extraordinarily fast and the zeroing of large portions of
> > memory is so too. That is why the impact of scrubd is negligible since
> > it's extremely fast.
>
> But that also disturbs cache lines that may well otherwise be useful.

Yes, but it's a short burst that occurs only very infrequently, and it takes
advantage of all the optimizations that modern memory subsystems have for
linear accesses. And if hardware exists that can offload that from the cpu,
then the cpu caches are only minimally affected.

> As has my scepticism about pre-zeroing actually providing any benefit
> on ppc64. Nevertheless, the only definitive answer is to actually
> measure the performance both ways.

Of course. The optimization depends on the type of load. If you use a
benchmark that writes to the whole of every page it touches then you will
see no benefit at all. For a kernel compile you will see a slight benefit.
For processing of a sparse matrix (page tables are one example) a
significant benefit can be obtained.
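
The trade-off argued in this exchange can be put into a rough back-of-envelope
model. What follows is only a sketch, not a measurement and not either
poster's code. It combines the two arguments: zeroing at fault time pays for
every cache line in the page but leaves touched lines in L2, while a prezeroed
page costs nothing in the fault path but each touched line comes from memory.
The function names and the zero_cycles_per_line parameter are hypothetical.
The 12-cycle and 200 - 300 cycle figures are the ones quoted above. The model
ignores the background cost of scrubd, hardware prefetching on linear access,
and any cache pollution.

```python
def fault_time_zeroing_cycles(lines_per_page, touched_lines,
                              zero_cycles_per_line, l2_hit_cycles):
    """Cost when the page is zeroed inside the fault handler.

    Every line in the page is zeroed (and lands in L2), then each line the
    program actually touches is served at L2 hit latency.
    """
    return lines_per_page * zero_cycles_per_line + touched_lines * l2_hit_cycles


def prezeroed_cycles(touched_lines, memory_cycles):
    """Cost when the page was zeroed earlier in the background.

    Nothing is zeroed in the fault path; each touched line is assumed to be
    fetched from memory. Background zeroing cost is deliberately left out.
    """
    return touched_lines * memory_cycles


def break_even_touched_lines(lines_per_page, zero_cycles_per_line,
                             l2_hit_cycles, memory_cycles):
    """Number of touched lines at which both strategies cost the same.

    Below this count the prezeroed page is cheaper in this model; above it,
    fault-time zeroing is cheaper.
    """
    if memory_cycles <= l2_hit_cycles:
        raise ValueError("memory latency must exceed L2 hit latency")
    return lines_per_page * zero_cycles_per_line / (memory_cycles - l2_hit_cycles)


if __name__ == "__main__":
    # Placeholder inputs: 32 lines per page and 10 cycles to zero a line are
    # illustrative assumptions, not measured values.
    lines, zero_cost, l2_hit, mem = 32, 10, 12, 250
    for touched in (1, 8, 32):
        print(touched,
              fault_time_zeroing_cycles(lines, touched, zero_cost, l2_hit),
              prezeroed_cycles(touched, mem))
    print(break_even_touched_lines(lines, zero_cost, l2_hit, mem))
```

The crossover depends almost entirely on the per-line zeroing cost and on how
sparsely the load touches its pages, which is why measuring both ways on real
workloads remains the only definitive answer.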