Date: Tue, 27 Feb 2024 11:20:02 -0800
From: Charlie Jenkins
To: Conor Dooley
Cc: Conor Dooley, Albert Ou, linux-kernel@vger.kernel.org, Eric Biggers,
 Evan Green, Palmer Dabbelt, Jisheng Zhang, Paul Walmsley, Clément Léger,
 linux-riscv@lists.infradead.org, Charles Lohr
Subject: Re: [PATCH v4 2/2] riscv: Set unalignment
 speed at compile time
References: <20240216-disable_misaligned_probe_config-v4-0-dc01e581c0ac@rivosinc.com>
 <20240216-disable_misaligned_probe_config-v4-2-dc01e581c0ac@rivosinc.com>
 <20240227-condone-impeach-9469dffc6b47@wendy>
 <20240227-citable-scanning-1fd48c96b758@spud>
X-Mailing-List: linux-kernel@vger.kernel.org
In-Reply-To: <20240227-citable-scanning-1fd48c96b758@spud>

On Tue, Feb 27, 2024 at 06:48:54PM +0000, Conor Dooley wrote:
> On Tue, Feb 27, 2024 at 10:17:21AM -0800, Charlie Jenkins wrote:
> > On Tue, Feb 27, 2024 at 11:39:25AM +0000, Conor Dooley wrote:
> > > On Fri, Feb 16, 2024 at 12:33:19PM -0800, Charlie Jenkins wrote:
>
> > > > +config RISCV_EMULATED_UNALIGNED_ACCESS
> > > > +	bool "Assume the CPU expects emulated unaligned memory accesses"
> > > > +	depends on NONPORTABLE
> > >
> > > This is portable too, right?
> >
> > I guess so? I think I would prefer to have the probing being the only
> > portable option.
>
> I dunno, I think there could be value to someone in always emulating
> this in the kernel and I don't think that should relegate them to the
> naughty step, given it can work everywhere.

Alright, I will remove the NONPORTABLE dependency.

> >
> > > > +config RISCV_SLOW_UNALIGNED_ACCESS
> > > > +	bool "Assume the CPU supports slow unaligned memory accesses"
> > > > +	depends on NONPORTABLE
> > > > +	help
> > > > +	  Assume that the CPU supports slow unaligned memory accesses. When
> > > > +	  enabled, this option improves the performance of the kernel on such
> > > > +	  CPUs.
> > >
> > > Does it? Are you sure that generating unaligned accesses on systems
> > > where they are slow is a performance increase?
> > > That said, I don't really see this option actually doing anything other
> > > than setting the value for hwprobe, so I don't actually know what the
> > > effect of this option actually is on the kernel's performance.
> > >
> > > Generally I would like to suggest a change from "CPU" to "system" here,
> > > since the slow cases that exist are mostly because the unaligned access
> > > is actually emulated in firmware.
> >
> > It would be ideal if "emulated" was used for any case of emulated
> > accesses (firmware or in the kernel). Doing emulated accesses will be
> > orders of magnitude slower than a processor that "slowly" handles the
> > accesses.
> >
> > So even if the processor performs a "slow" access, it could still be
> > beneficial for the kernel to do the misaligned access rather than manual
> > do the alignment.
>
> Right. But, at least from a probing perspective, SLOW is what gets
> selected when firmware emulates the unaligned access so to userspace
> seeing slow means that the performance could be horrifically bad:
>
> | rzfive:
> | cpu0: Ratio of byte access time to unaligned word access is
> | 1.05, unaligned accesses are fast
> |
> | icicle:
> |
> | cpu1: Ratio of byte access time to unaligned word access is
> | 0.00, unaligned accesses are slow
> | cpu2: Ratio of byte access time to unaligned word access is
> | 0.00, unaligned accesses are slow
> | cpu3: Ratio of byte access time to unaligned word access is
> | 0.00, unaligned accesses are slow
> |
> | cpu0: Ratio of byte access time to unaligned word access is
> | 0.00, unaligned accesses are slow
> |
> | k210:
> |
> | cpu1: Ratio of byte access time to unaligned word access is
> | 0.02, unaligned accesses are slow
> | cpu0: Ratio of byte access time to unaligned word access is
> | 0.02, unaligned accesses are slow
> |
> | starlight:
> |
> | cpu1: Ratio of byte access time to unaligned word access is
> | 0.01, unaligned accesses are slow
> | cpu0: Ratio of byte access time to unaligned word access is
> | 0.02, unaligned accesses are slow
> |
> | vexriscv/orangecrab:
> |
> | cpu0: Ratio of byte access time to unaligned word access is
> | 0.00, unaligned accesses are slow
> https://lore.kernel.org/all/CAMuHMdVtXGjP8VFMiv-7OMFz1XvfU1cz=Fw4jL3fcp4wO1etzQ@mail.gmail.com/

If the accesses are horrifically slow, then maybe the probe should flag
them as emulated rather than slow.

>
> > Currently there is no place that takes into account this "slow" option
> > but I wanted to leave it open for future optimizations.
>
> I don't think you can really do much optimisation if you detect that it
> is slow, and since this option is analogous to detecting slow I dunno if
> you can do anything here either? This option really just seems to me to
> mean "don't do any optimisations for unaligned being fast but also don't
> emulate it in the kernel".

I am fine with that being the meaning of this option. However, on a
system where misaligned accesses are twice as slow as correctly aligned
accesses, the misaligned accesses would reasonably be classified as
"slow". Even so, something like the checksum functions would probably
still want to do the misaligned accesses, because performing the
alignment manually would be even slower. This is all hypothetical and
not a "real" use case, so maybe I am optimizing where no optimization is
needed.

>
> > > > However, the kernel will run much more slowly, or will not be
> > > > +	  able to run at all, on CPUs that do not support unaligned memory
> > > > +	  accesses.
> > > > +
> > > >  config RISCV_EFFICIENT_UNALIGNED_ACCESS
> > > >  	bool "Assume the CPU supports fast unaligned memory accesses"
> > > >  	depends on NONPORTABLE
> > > >  	select DCACHE_WORD_ACCESS if MMU
> > > >  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
> > > >  	help
> > > > -	  Say Y here if you want the kernel to assume that the CPU supports
> > > > -	  efficient unaligned memory accesses. When enabled, this option
> > > > -	  improves the performance of the kernel on such CPUs. However, the
> > > > -	  kernel will run much more slowly, or will not be able to run at all,
> > > > -	  on CPUs that do not support efficient unaligned memory accesses.
> > > > +	  Assume that the CPU supports fast unaligned memory accesses. When
> > > > +	  enabled, this option improves the performance of the kernel on such
> > > > +	  CPUs. However, the kernel will run much more slowly, or will not be
> > > > +	  able to run at all, on CPUs that do not support efficient unaligned
> > > > +	  memory accesses.
> > > > +
> > > > +config RISCV_UNSUPPORTED_UNALIGNED_ACCESS
> > >
> > > This option needs to be removed. The uabi states that unaligned access
> > > is supported in userspace, if the cpu or firmware does not implement
> > > unaligned access then the kernel must emulate it.
> >
> > Should it removed from hwprobe as well then?
>
> No, I actually suggested that it be documented etc. Originally
> "UNSUPPORTED" was "UNKNOWN" and nothing more than the default value but
> I suggested that it be documented since that would allow a system that
> did not have the same uabi problem to use all the same defines.
> Technically it is possible for unaligned access to be unsupported, if
> you have a kernel that does not have the emulator but does have the
> hwprobe stuff supported. I think there was about a 6 month period where
> this was the case, so combine that with firmware that does not do the
> emulation and unaligned accesses are unsupported.

Sounds great, will remove :)

- Charlie

>
> Cheers,
> Conor.
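
P.S. To make the checksum hypothetical a bit more concrete, here is a
minimal userspace sketch of the two strategies being compared. This is
not kernel code and is not taken from the patch; every function name
below is made up for illustration, and a little-endian host is assumed.
One variant assembles each word from single-byte loads (what "doing the
alignment manually" roughly amounts to), the other lets the compiler
issue a direct, possibly misaligned, word load via memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: build a 32-bit little-endian word from four
 * byte loads. This never issues a misaligned word access.
 */
static inline uint32_t load32_bytewise(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: read a 32-bit word in host byte order. The
 * memcpy keeps this well-defined C; a compiler targeting a machine
 * with usable unaligned access may lower it to one word load.
 */
static inline uint32_t load32_direct(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Sum nwords 32-bit words starting at p, using byte loads only. */
uint32_t sum32_bytewise(const unsigned char *p, size_t nwords)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < nwords; i++)
		sum += load32_bytewise(p + 4 * i);
	return sum;
}

/* Same sum, but with direct (possibly misaligned) word loads. */
uint32_t sum32_direct(const unsigned char *p, size_t nwords)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < nwords; i++)
		sum += load32_direct(p + 4 * i);
	return sum;
}
```

On a little-endian machine both variants return the same sum for any
starting offset; which one is faster on hardware where misaligned loads
are merely "slow" would have to be measured rather than assumed.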