Date: Mon, 21 Dec 2015 12:46:38 +0000
From: Will Deacon
To: Andrew Pinski
Cc: pinsia@gmail.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] ARM64: Improve copy_page for 128 cache line sizes.
Message-ID: <20151221124637.GN23092@arm.com>
In-Reply-To: <1450570278-19404-1-git-send-email-apinski@cavium.com>

On Sat, Dec 19, 2015 at 04:11:18PM -0800, Andrew Pinski wrote:
> Adding a check for the cache line size is not much overhead.
> Special case 128 byte cache line size.
> This improves copy_page by 85% on ThunderX compared to the
> original implementation.

So this patch seems to:

  - Align the loop
  - Increase the prefetch size
  - Unroll the loop once

Do you know where your 85% boost comes from between these? I'd really like
to avoid having multiple versions of copy_page, if possible, but maybe we
could end up with something that works well enough regardless of cacheline
size. Understanding what your bottleneck is would help to lead us in the
right direction.

Also, how are you measuring the improvement? If you can share your test
somewhere, I can see how it affects the other systems I have access to.

Will