Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp1725553pxj; Sun, 16 May 2021 02:10:06 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxCV94+lFi805ZbkVrxhjHNjCOoyRxebSRP4dLKCP0w7LRcytfizXdGtaMc9eJOxh90FEAD X-Received: by 2002:a17:906:9257:: with SMTP id c23mr19883477ejx.392.1621156206252; Sun, 16 May 2021 02:10:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1621156206; cv=none; d=google.com; s=arc-20160816; b=E8S80+mAhy5RoZSW/qwUqANoRCrJD6KBrrP6cUqrX2QqDLpXV2i7qfRI6cnvccMnbN bOtwtjn9Fa6tJ45y0NiCBzYzpJF84YDxlG4TroKIAuTV1nQJjEwGz9dLj1pT8SIQ0DAw 0+OiKodPjzvvYmPWnEeDBMT7cmprT7uZ/Pyy+7vI7eWGeLdJRYV0WAH6igtXptkcXSuK DxX8IwIVtlAvm3lIWtdi2hoTHesKZ2d02qPhkHBxAzxxIxRM5hh5g2zDacUbVkBX5MLj l3KTiPRyqpMEXzf36Q+tFDTWdk+m/076aL4JuWWG11yPaN6drN9Pu5qyQHVQAnqZsLgV ycHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=9JNlgT3CZfPfkMugaoe08+txYEgEszvVPJG3yrmBL3M=; b=IWzUhVu83UBcJSIQ/sBi0MopSakwMvym2uIaWTjh6t58onrJMNiO7Qda4b9EsQEzQk r7+degm4Ef/glu1aBS0acWgWqzBUZazZLfl42Wk8qwecXurY8t8gL0fzy6WWd2VI6a9n zzIMujWukFJ8MkRV4jtbKcANoRRnGMyOmQZXt4bvxDbTy0pP3WQrtAoKKH3yLlqe8UnC Qckm6MrSz9t/e5Peh28w1+c7v39wKl3bea2LKAorWrojpVpsJct6H0U/ebvWlOOSQX0A +jLIpCUhoZ3odBukvYvx+v1nInZ9TRMS7x6/2eGcnbt2kvUFd/XurQQBxk/eNJHfPxlm tEkg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=i0OdoOqq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id k9si11645530edv.428.2021.05.16.02.09.43; Sun, 16 May 2021 02:10:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=i0OdoOqq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234797AbhEOUMr (ORCPT + 99 others); Sat, 15 May 2021 16:12:47 -0400 Received: from mail.kernel.org ([198.145.29.99]:38240 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231280AbhEOUMm (ORCPT ); Sat, 15 May 2021 16:12:42 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 0D9026121E; Sat, 15 May 2021 20:11:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1621109489; bh=KGVNwawW8npQ64DMYLeLHsXJrejNvMH6jfAnNO6Flaw=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=i0OdoOqqwigdqqTYDpx3l3apcdLe23+2p7LaZMybDEqnZZ2ifXQplJDsw9pcGHsZC SX2a9lhCIInl1FPm9TkC1mRi3rg+JxtSj5pwtIFHsF3AKNtODVDY1TZBLOC1+IDaNi xOtPtLeN68lACsd8QQnvGzcsRVUmNP/Bc3z+fCFgUuzAiwK/UYGq/CR6qDLyZvNzfb PApSv6aH9oQMXVAcRmPq47qmXgM6h3Am/6EhjwNRvUQUGF7TlwDdZdiZx+NQy561NZ Chd2DlBi7PG/IdF4i9IICs+YvJtrD6WVtdk6Z/a/8Az1MArP0VSpM5PYX5hvyqQ9qj rseuYpaHMJc8w== Received: by mail-wr1-f54.google.com with SMTP id d11so2402320wrw.8; Sat, 15 May 2021 13:11:28 -0700 (PDT) X-Gm-Message-State: AOAM5337Yg2e/oN01jP3cD1pUzAHzSoM9OXpETFHJp6zqb8Caze88DyO NWC+W8jy7frMlClRbweMq320Fv6merGHcY1s6YA= X-Received: by 2002:adf:fe04:: with SMTP id n4mr9290331wrr.361.1621109487649; Sat, 15 May 2021 13:11:27 -0700 (PDT) MIME-Version: 1.0 References: <20210514100106.3404011-1-arnd@kernel.org> <20210514100106.3404011-4-arnd@kernel.org> <3d70eb2a-2969-197e-63e8-f3e0a6a8ddd8@physik.fu-berlin.de> <71b5d15d-7bd2-aa08-cc0a-3caccf9c66c8@physik.fu-berlin.de> In-Reply-To: <71b5d15d-7bd2-aa08-cc0a-3caccf9c66c8@physik.fu-berlin.de> From: Arnd Bergmann Date: Sat, 15 May 2021 22:10:23 +0200 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v2 03/13] sh: remove unaligned access for sh4a To: John Paul Adrian Glaubitz Cc: linux-arch , Linus Torvalds , Vineet Gupta , Yoshinori Sato , Rich Felker , Linux-sh list , Linux Kernel Mailing List Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, May 15, 2021 at 5:36 PM John Paul Adrian Glaubitz wrote: > On 5/14/21 2:22 PM, Arnd Bergmann wrote: > >> My Renesas SH4-Boards actually run an sh4a-Kernel, not an sh4-Kernel: > >> > >> root@tirpitz:~> uname -a > >> Linux tirpitz 5.11.0-rc4-00012-g10c03c5bf422 #161 PREEMPT Mon Jan 18 21:10:17 CET 2021 sh4a GNU/Linux > >> root@tirpitz:~> > >> > >> So, if this change reduces performance on sh4a, I would rather not merge it. > > > > It only makes a difference in very specific scenarios in which unaligned > > accesses are done in a fast path, e.g. when forwarding network packet > > at a high rate on a big-endian kernel (little-endian kernels wouldn't run into > > this on IP headers). If you have a use case for this machine on which the > > you can show a performance regression, I can add a patch on top to put > > the optimized sh4a get_unaligned_le32() back. Dropping this patch > > altogether would make the series much more complex because most of > > the associated code gets removed in the end. > > Hmm, okay. But why does code which sits below arch/sh have to be removed anyway? > > I don't fully understand why it poses any maintenance burden/ What I'm removing is the part that lets architectures override the generic version. > > As I mentioned, supporting "movua" in the compiler likely has a much > > larger impact on performance, as it would also help in user space, and > > it should improve the networking case on little-endian kernels by replacing > > the four separate byte loads/shift pairs with a movua plus a byteswap. > > The problem is that - at least in Debian - we use the sh4 baseline while the kernel > supports both sh4 and sh4a, so we can't use any of these instructions in userland at > the moment. I tried building an sh7785lcr_defconfig with and without the patch, and found that the only affected files are: - in-kernel nfs client - crc32c/sha1/sha256 hash functions - device probing for libata, scsi-core, scsi-disk, hid, r8168 (should not matter after boot) - msdos partition parsing Any nfs client performance difference is probably not even measurable even at gigabit ethernet speed. I see that the hash functions are notably different, but I don't know if the output from the new generic code is actually better or worse than the original. If you do think this is important, please try the version from https://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic.git unaligned-sh4a against the version without the last change in that series. If you can find a relevant test case that exercises it, you may want to add a custom implementation of the hash functions as well. Arnd