From: David Laight
To: "'michael@michaelkloos.com'", Palmer Dabbelt, Paul Walmsley, Albert Ou
CC: "linux-riscv@lists.infradead.org", "linux-kernel@vger.kernel.org"
Subject: RE: [PATCH v4] riscv: Fixed misaligned memory access. Fixed pointer comparison.
Date: Sat, 29 Jan 2022 13:41:52 +0000
Message-ID: <7ef35550b4dd44ffabfd7ca1e0ee27fa@AcuMS.aculab.com>
References: <20220129022448.37483-1-michael@michaelkloos.com>
In-Reply-To: <20220129022448.37483-1-michael@michaelkloos.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: michael@michaelkloos.com
...
> [v4]
>
> I could not resist implementing the optimization I mentioned in
> my v3 notes. I have implemented the roll over of data by cpu
> register in the misaligned fixup copy loops. Now, only one load
> from memory is required per iteration of the loop.

I nearly commented...

...
> +    /*
> +     * Fix Misalignment Copy Loop.
> +     * load_val1 = load_ptr[0];
> +     * while (store_ptr != store_ptr_end) {
> +     *   load_val0 = load_val1;
> +     *   load_val1 = load_ptr[1];
> +     *   *store_ptr = (load_val0 >> {a6}) | (load_val1 << {a7});
> +     *   load_ptr++;
> +     *   store_ptr++;
> +     * }
> +     */
> +    REG_L t0, 0x000(a3)
> +    1:
> +    beq   t3, t6, 2f
> +    mv    t1, t0
> +    REG_L t0, SZREG(a3)
> +    srl   t1, t1, a6
> +    sll   t2, t0, a7
> +    or    t1, t1, t2
> +    REG_S t1, 0x000(t3)
> +    addi  a3, a3, SZREG
> +    addi  t3, t3, SZREG
> +    j 1b

No point jumping to a conditional branch that jumps back.
Make this a:
    bne   t3, t6, 1b
and move 1: down one instruction.
(Or is the 'beq' at the top even possible - there is likely to
be an earlier test for zero length copies.)

> +    2:

I also suspect it is worth unrolling the loop once.
You lose the 'mv t1, t0' and one 'addi' for each word transferred.

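As a rough C model of both suggestions (not the actual assembly, and
untested against the patch): the loop test moves to the bottom, and the
body handles two words per pass so the two load values simply swap roles
instead of being copied. The function name copy_shifted, the fixed 64-bit
word size and the "at least one word" precondition are assumptions made
for illustration; none of them come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical C model of the misaligned fixup loop.
 * Assumptions (not from the patch): 64-bit words, n_words > 0,
 * 0 < shift < 64, and load_ptr[0..n_words] are all readable.
 * 'shift' plays the role of {a6}; 64 - shift plays the role of {a7}.
 */
void copy_shifted(uint64_t *store_ptr, const uint64_t *load_ptr,
                  size_t n_words, unsigned shift)
{
    assert(n_words > 0 && shift > 0 && shift < 64);

    uint64_t *store_ptr_end = store_ptr + n_words;
    unsigned rshift = shift;
    unsigned lshift = 64 - shift;
    uint64_t v0 = load_ptr[0];
    uint64_t v1;

    do {
        /* First word of the pair: v0 is the old value, v1 the new. */
        v1 = load_ptr[1];
        store_ptr[0] = (v0 >> rshift) | (v1 << lshift);
        if (store_ptr + 1 == store_ptr_end)
            break;          /* odd word count: stop mid-pair */

        /* Second word: roles swap, so no register copy is needed. */
        v0 = load_ptr[2];
        store_ptr[1] = (v1 >> rshift) | (v0 << lshift);

        /* One pointer update per pair instead of one per word. */
        load_ptr += 2;
        store_ptr += 2;
    } while (store_ptr != store_ptr_end);   /* test at the bottom */
}
```

The mid-pair exit is the price of unrolling: it adds one extra branch per
pass in exchange for dropping the copy and half of the pointer updates.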
I think someone mentioned that there is a delay of a few clocks before
the data from the memory read (REG_L) is actually available.
On an in-order cpu this is likely to be a full pipeline stall.
So move the 'addi' up between the 'REG_L' and 'sll' instructions.
(The offset will need to be -SZREG to match.)

    David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)