Date: Wed, 16 Mar 2016 10:07:59 +0000
From: Will Deacon
To: Ganesh Mahendran
Cc: catalin.marinas@arm.com, stable@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, tchalamarla@cavium.com, rrichter@cavium.com, apinski@cavium.com, timur@codeaurora.org
Subject: Re: [PATCH] Revert "arm64: Increase the max granular size"
Message-ID: <20160316100759.GA18387@arm.com>
References: <1458120743-12145-1-git-send-email-opensource.ganesh@gmail.com>
In-Reply-To: <1458120743-12145-1-git-send-email-opensource.ganesh@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

[adding Cavium folk and Timur]

On Wed, Mar 16, 2016 at 05:32:23PM +0800, Ganesh Mahendran wrote:
> Reverts commit 97303480753e ("arm64: Increase the max granular size").
>
> The commit 97303480753e ("arm64: Increase the max granular size") will
> degrade system performance in some cpus.
>
> We test wifi network throughput with iperf on Qualcomm msm8996 CPU:
> ----------------
> run on host:
>   # iperf -s
> run on device:
>   # iperf -c -t 100 -i 1
> ----------------
>
> Test result:
> ----------------
> with commit 97303480753e ("arm64: Increase the max granular size"):
>     172MBits/sec
>
> without commit 97303480753e ("arm64: Increase the max granular size"):
>     230MBits/sec
> ----------------
>
> Some module like slab/net will use the L1_CACHE_SHIFT, so if we do not
> set the parameter correctly, it may affect the system performance.
>
> So revert the commit.

Unfortunately, the original patch is required to support the 128-byte L1
cache lines of Cavium ThunderX, so we can't simply revert it like this.
Similarly, the desire for a single, multiplatform kernel image prevents
us from reasonably fixing this at compile time to anything other than
the expected maximum value.

Furthermore, Timur previously said that the change is also required "on
our [Qualcomm] silicon", but I'm not sure if this is msm8996 or not:

  http://lkml.kernel.org/r/CAOZdJXUiRMAguDV+HEJqPg57MyBNqEcTyaH+ya=U93NHb-pdJA@mail.gmail.com

You could look into making ARCH_DMA_MINALIGN a runtime value, but that
looks like an uphill struggle to me. Alternatively, we could only warn
if the CWG is bigger than L1_CACHE_BYTES *and* we have a non-coherent DMA
master, but that doesn't solve any performance issues from having things
like locks sharing cachelines, not that I think we ever got any data on
that (afaik, we don't pad locks to cacheline boundaries anyway). I'm also
not sure what it would mean for PCI NoSnoop transactions.

Will
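
For illustration only, the "warn only if the CWG is bigger than
L1_CACHE_BYTES *and* we have a non-coherent DMA master" idea could be
sketched roughly as below. This is a user-space sketch, not kernel code:
the CTR_EL0 value is passed in as a plain integer instead of being read
from the register, and should_warn() and its have_noncoherent_dma flag
are hypothetical names, not an existing kernel interface. It assumes
L1_CACHE_SHIFT is 7, i.e. the 128-byte value discussed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: compile-time maximum of 128 bytes, as with the patch applied. */
#define L1_CACHE_SHIFT	7
#define L1_CACHE_BYTES	(1u << L1_CACHE_SHIFT)

/* CTR_EL0.CWG lives in bits [27:24]: log2 of the granule in 4-byte words. */
#define CTR_CWG_SHIFT	24
#define CTR_CWG_MASK	0xfu

/*
 * Cache writeback granule in bytes for a given CTR_EL0 value.
 * CWG == 0 means the CPU does not report it, so fall back to the
 * compile-time maximum.
 */
static unsigned int cwg_bytes(uint32_t ctr)
{
	unsigned int cwg = (ctr >> CTR_CWG_SHIFT) & CTR_CWG_MASK;

	return cwg ? 4u << cwg : L1_CACHE_BYTES;
}

/*
 * Hypothetical policy: only complain when the hardware granule exceeds
 * what the kernel was built for *and* a non-coherent DMA master exists.
 */
static bool should_warn(uint32_t ctr, bool have_noncoherent_dma)
{
	return have_noncoherent_dma && cwg_bytes(ctr) > L1_CACHE_BYTES;
}
```

With these definitions, a CWG field of 4 gives a 64-byte granule and a
field of 5 gives 128 bytes, neither of which would warn. A field of 6
(256 bytes) would warn only when have_noncoherent_dma is set. As noted
above, such a check does nothing for the performance side of the problem.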