From: David Laight
To: 'Evan Green', Palmer Dabbelt
CC: Simon Hosie, Albert Ou, Alexandre Ghiti, Andrew Jones, Andy Chiu,
    Anup Patel, Conor Dooley, Greentime Hu, Guo Ren, "Heiko Stuebner",
    Jisheng Zhang, Jonathan Corbet, Ley Foon Tan, Li Zhengyu,
    "Masahiro Yamada", Palmer Dabbelt, "Paul Walmsley", Sia Jee Heng,
    Sunil V L, Xianting Tian, Yangyu Chen,
    "linux-doc@vger.kernel.org", "linux-kernel@vger.kernel.org",
    "linux-riscv@lists.infradead.org"
Subject: RE: [PATCH 1/2] RISC-V: Probe for unaligned access speed
Date: Sun, 25 Jun 2023 21:42:07 +0000
Message-ID: <88d0b61ca01d4c1a87a17d3c34baa2a6@AcuMS.aculab.com>
References: <20230623222016.3742145-1-evan@rivosinc.com>
    <20230623222016.3742145-2-evan@rivosinc.com>
In-Reply-To: <20230623222016.3742145-2-evan@rivosinc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Evan Green
> Sent: 23 June 2023 23:20
>
> Rather than deferring misaligned access speed determinations to a vendor
> function, let's probe them and find out how fast they are. If we
> determine that a misaligned word access is faster than N byte accesses,
> mark the hardware's misaligned access as "fast".
>
> Fix the documentation as well to reflect this bar. Previously the only
> SoC that returned "fast" was the THead C906. The change to the
> documentation is more a clarification, since the C906 is fast in the
> sense of the corrected documentation.
>
> Signed-off-by: Evan Green
> ---
...
> diff --git a/Documentation/riscv/hwprobe.rst b/Documentation/riscv/hwprobe.rst
> index 19165ebd82ba..710325751766 100644
> --- a/Documentation/riscv/hwprobe.rst
> +++ b/Documentation/riscv/hwprobe.rst
> @@ -88,12 +88,12 @@ The following keys are defined:
>      always extremely slow.
>
>    * :c:macro:`RISCV_HWPROBE_MISALIGNED_SLOW`: Misaligned accesses are supported
> -    in hardware, but are slower than the cooresponding aligned accesses
> -    sequences.
> +    in hardware, but are slower than N byte accesses, where N is the native
> +    word size.
>
>    * :c:macro:`RISCV_HWPROBE_MISALIGNED_FAST`: Misaligned accesses are supported
> -    in hardware and are faster than the cooresponding aligned accesses
> -    sequences.
> +    in hardware and are faster than N byte accesses, where N is the native
> +    word size.

I think I'd just say 'faster/slower than using byte accesses' (i.e. no N).

There are two obvious FAST cases:
1) the misaligned access takes an extra clock - worth aligning copies.
2) the misaligned access is pretty much as fast as an aligned one.

Even if you find it hard to distinguish them you should probably
allow for them both.

x86 (on Intel (non-Atom) CPUs) is definitely in the latter camp.
Misaligned copies are measurably slower - but not enough to
ever worry about.
I think that misaligned transfers get split into 8 byte accesses
(pretty irrelevant in the kernel) and then accesses that cross
cache line boundaries are split on the boundary.
With pipelined writes and two reads/clock it doesn't often
make a measurable difference.

That is definitely what I see for uncached accesses to PCIe space.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
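
To make the two FAST cases above concrete, here is a rough sketch of how a probe result might be split three ways rather than two. It is only an illustration: the enum names, the classify_misaligned() helper and the tolerance_pct threshold are all hypothetical and are not part of the patch or of the existing RISCV_HWPROBE_MISALIGNED_* values. The three costs are assumed to have been measured already, in any consistent unit.

```c
#include <stdint.h>

/*
 * Hypothetical labels for illustration only; these are NOT the kernel's
 * RISCV_HWPROBE_MISALIGNED_* values.
 */
enum misaligned_class {
	MISALIGNED_CLASS_SLOW,		/* no faster than using byte accesses */
	MISALIGNED_CLASS_FAST,		/* beats byte accesses, still worth aligning copies */
	MISALIGNED_CLASS_NEAR_FREE,	/* pretty much as fast as an aligned access */
};

/*
 * Classify from three measured costs (any consistent unit, e.g. cycles
 * per word copied). tolerance_pct is an assumed threshold: a misaligned
 * access within that percentage of the aligned cost counts as near-free.
 */
static enum misaligned_class
classify_misaligned(uint64_t aligned_cost, uint64_t misaligned_cost,
		    uint64_t byte_cost, unsigned int tolerance_pct)
{
	/* Not faster than byte accesses: the "slow" bar from the patch. */
	if (misaligned_cost >= byte_cost)
		return MISALIGNED_CLASS_SLOW;

	/* misaligned <= aligned * (100 + tolerance) / 100, without dividing */
	if (misaligned_cost * 100 <=
	    aligned_cost * (100 + (uint64_t)tolerance_pct))
		return MISALIGNED_CLASS_NEAR_FREE;

	return MISALIGNED_CLASS_FAST;
}
```

If the two FAST cases turn out to be hard to tell apart in practice, a caller could simply treat NEAR_FREE as FAST, which gives back the two-way split in the patch.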