Subject: Re: [PATCH 3 of 3] Add __raw_memcpy_toio32 to each arch
From: "Bryan O'Sullivan" <bos@pathscale.com>
To: Andi Kleen <ak@suse.de>
Cc: akpm@osdl.org, linux-kernel@vger.kernel.org, hch@infradead.org,
       rdreier@cisco.com
In-Reply-To: <200601102109.00067.ak@suse.de>
References: <5673a186625f62491f33.1136922839@serpentine.internal.keyresearch.com>
	 <200601102109.00067.ak@suse.de>
Content-Type: text/plain
Organization: PathScale, Inc.
Date: Tue, 10 Jan 2006 14:52:49 -0800
Message-Id: <1136933569.6294.40.camel@serpentine.pathscale.com>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 703
Lines: 19

On Tue, 2006-01-10 at 21:08 +0100, Andi Kleen wrote:
> On Tuesday 10 January 2006 20:53, Bryan O'Sullivan wrote:
> > Most arches use the generic routine.  x86_64 uses memcpy32 instead;
> > this is substantially faster, even over a bus that is much slower than
> > the CPU.
> 
> So did you run numbers against the C implementation with -funroll-loops ? 
> What were the results?

The C implementation is about 5% slower when copying over
HyperTransport.
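
For anyone following along without the patch in front of them, the
"C implementation" here means a plain loop of 32-bit stores. The sketch
below is only a userspace illustration of that idea, not the code from
the patch: the name copy_words32 is made up, and a volatile pointer
stands in for a real MMIO mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for a generic 32-bit copy routine.
 * Copies `count` 32-bit words (not bytes) from `from` to `to`, one
 * store per word.  The volatile qualifier keeps the compiler from
 * eliding the stores or turning them into byte-wide accesses; a real
 * kernel routine would use the arch's raw MMIO write accessor instead.
 */
static void copy_words32(volatile uint32_t *to, const uint32_t *from,
			 size_t count)
{
	assert(to != NULL && from != NULL);

	while (count--)
		*to++ = *from++;
}
```

Building a loop like this with and without -funroll-loops is the kind
of comparison being asked about above.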

	<b
