Date: Tue, 22 Jun 2021 04:07:47 +0300
From: Nick Kossifidis
To: Matteo Croce
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, Paul Walmsley, Palmer Dabbelt, Albert Ou, Atish Patra, Emil Renner Berthing, Akira Tsukamoto, Drew Fustini, Bin Meng, David Laight, Guo Ren
Subject: Re: [PATCH v3 3/3] riscv: optimized memset
Organization: FORTH
In-Reply-To: <20210617152754.17960-4-mcroce@linux.microsoft.com>
References: <20210617152754.17960-1-mcroce@linux.microsoft.com> <20210617152754.17960-4-mcroce@linux.microsoft.com>
Message-ID: <17cd289430f08f2b75b7f04242c646f6@mailhost.ics.forth.gr>

On 2021-06-17 18:27, Matteo Croce wrote:
> +
> +void *__memset(void *s, int c, size_t count)
> +{
> +	union types dest = { .u8 = s };
> +
> +	if (count >= MIN_THRESHOLD) {
> +		const int bytes_long = BITS_PER_LONG / 8;

You could make 'const int bytes_long = BITS_PER_LONG / 8;' and
'const int mask = bytes_long - 1;' from your memcpy patch visible to
memset as well (static const...) and use them here (it would make more
sense to name mask word_mask).

> +		unsigned long cu = (unsigned long)c;
> +
> +		/* Compose an ulong with 'c' repeated 4/8 times */
> +		cu |= cu << 8;
> +		cu |= cu << 16;
> +#if BITS_PER_LONG == 64
> +		cu |= cu << 32;
> +#endif
> +

You don't have to create cu here. You'll fill the dest buffer with 'c'
anyway, so once you have written enough 'c's to be able to grab an
aligned word full of them from dest, you can just grab that word and
keep filling dest with it.

> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +		/* Fill the buffer one byte at time until the destination
> +		 * is aligned on a 32/64 bit boundary.
> +		 */
> +		for (; count && dest.uptr % bytes_long; count--)

You could reuse & mask here instead of % bytes_long.

> +			*dest.u8++ = c;
> +#endif

I noticed you also used CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS in your
memcpy patch. Is it worth it here? To begin with, riscv doesn't set it,
and even if it did, we are talking about a loop that runs only a few
times to reach the alignment boundary (worst case it runs 7 times). I
don't think we gain much here, even for archs that have efficient
unaligned access.
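To make the three suggestions above concrete, here is a rough userspace sketch, not the patch's code. The names sketch_memset, SKETCH_MIN_THRESHOLD and ulong_alias are made up for illustration, the 'union types' layout is only assumed to resemble the one in the memcpy patch, and the may_alias attribute is a GCC/Clang extension standing in for the kernel's -fno-strict-aliasing build. It shares bytes_long/word_mask at file scope, tests alignment with & word_mask, drops the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS guard, and reads the fill word back from dest instead of composing cu by shifting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to be shared between memcpy and memset, as suggested above. */
static const size_t bytes_long = sizeof(unsigned long);
static const size_t word_mask = sizeof(unsigned long) - 1;

/* GCC/Clang extension: lets word accesses alias the byte buffer. */
typedef unsigned long __attribute__((__may_alias__)) ulong_alias;

/* Assumed to resemble the patch's 'union types'; only what is needed here. */
union types {
	unsigned char *u8;
	ulong_alias *ulong;
	uintptr_t uptr;
};

/* Hypothetical value; the patch defines its own MIN_THRESHOLD. */
#define SKETCH_MIN_THRESHOLD 16

/* After aligning (at most sizeof(long) - 1 bytes) a full word must remain. */
_Static_assert(SKETCH_MIN_THRESHOLD >= 2 * sizeof(unsigned long) - 1,
	       "threshold too small for the read-back trick");

void *sketch_memset(void *s, int c, size_t count)
{
	union types dest = { .u8 = s };

	if (count >= SKETCH_MIN_THRESHOLD) {
		unsigned long cu;
		size_t i;

		/* Byte stores until dest sits on a word boundary. */
		for (; count && (dest.uptr & word_mask); count--)
			*dest.u8++ = (unsigned char)c;

		assert((dest.uptr & word_mask) == 0);

		/* Fill the first aligned word bytewise, then read it back whole. */
		for (i = 0; i < bytes_long; i++)
			dest.u8[i] = (unsigned char)c;
		cu = *dest.ulong++;
		count -= bytes_long;

		/* Keep filling with the word grabbed from dest. */
		for (; count >= bytes_long; count -= bytes_long)
			*dest.ulong++ = cu;
	}

	/* Tail bytes, or the whole buffer when it is below the threshold. */
	while (count--)
		*dest.u8++ = (unsigned char)c;

	return s;
}
```

Whether the read-back beats the shift/or sequence on real hardware is not something this sketch can show; it only demonstrates that the fill word can come from dest itself once one aligned word has been written.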