References:
 <9a8137149a164a13a7a04d72b133ad3b@AcuMS.aculab.com>
In-Reply-To: <9a8137149a164a13a7a04d72b133ad3b@AcuMS.aculab.com>
From: Matteo Croce
Date: Sun, 19 Sep 2021 21:13:24 +0200
Subject: Re: [PATCH] riscv: use the generic string routines
To: David Laight
Cc: Guo Ren, Palmer Dabbelt, linux-riscv, Linux Kernel Mailing List,
 linux-arch, Paul Walmsley, Albert Ou, Atish Patra,
 Emil Renner Berthing, Akira Tsukamoto, Drew Fustini, Bin Meng,
 Christoph Hellwig
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 13, 2021 at 1:35 PM David Laight wrote:
>
> > > These ended up getting rejected by Linus, so I'm going to hold off on
> > > this for now. If they're really out of lib/ then I'll take the C
> > > routines in arch/riscv, but either way it's an issue for the next
> > > release.
> > Agree, we should take the C routine in arch/riscv for common
> > implementation. If any vendor what custom implementation they could
> > use the alternative framework in errata for string operations.
>
> I though the asm ones were significantly faster because
> they were less affected by read latency.
>
> (But they were horribly broken for misaligned transfers.)
>

I can get exactly the same performance (and very similar machine code)
in C with this on top of the C memset implementation:

--- a/arch/riscv/lib/string.c
+++ b/arch/riscv/lib/string.c
@@ -112,9 +112,12 @@ EXPORT_SYMBOL(__memmove);
 void *memmove(void *dest, const void *src, size_t count) __weak __alias(__memmove);
 EXPORT_SYMBOL(memmove);
 
+#define BATCH 4
+
 void *__memset(void *s, int c, size_t count)
 {
 	union types dest = { .as_u8 = s };
+	int i;
 
 	if (count >= MIN_THRESHOLD) {
 		unsigned long cu = (unsigned long)c;
@@ -138,8 +141,12 @@ void *__memset(void *s, int c, size_t count)
 		}
 
 		/* Copy using the largest size allowed */
-		for (; count >= BYTES_LONG; count -= BYTES_LONG)
-			*dest.as_ulong++ = cu;
+		for (; count >= BYTES_LONG * BATCH; count -= BYTES_LONG * BATCH) {
+#pragma GCC unroll 4
+			for (i = 0; i < BATCH; i++)
+				dest.as_ulong[i] = cu;
+			dest.as_ulong += BATCH;
+		}
 	}

On the BeagleV, the memset speeds with the different batch sizes are:

1 (stock): 267 Mb/s
2:         272 Mb/s
4:         276 Mb/s
8:         276 Mb/s

The problem with a bigger batch size is that the loop only runs while at
least BYTES_LONG * BATCH bytes remain, so it falls back to a single-byte
copy when the buffer is too small to fill a whole batch.

Regards,
-- 
per aspera ad upstream
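
To make the tail problem above concrete, here is a minimal userspace
sketch. It is not the kernel code: the name sketch_memset, the word_tail
flag and the returned tail count are hypothetical, introduced only for
illustration, and the alignment handling of the real routine is left out.
The word-sized tail loop is one possible mitigation and an assumption of
this sketch, not something proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH 4				/* same value as in the patch */
#define BYTES_LONG sizeof(unsigned long)

/*
 * Illustrative only: fill count bytes at s with the byte c.
 * Returns how many bytes were left to the final byte-at-a-time loop,
 * so the cost of the fallback can be observed.
 * If word_tail is nonzero, leftover whole words are written one word
 * at a time before the byte loop (hypothetical mitigation).
 */
static size_t sketch_memset(void *s, int c, size_t count, int word_tail)
{
	unsigned char *p = s;
	unsigned long cu = (unsigned char)c;
	size_t i, tail;

	/* Replicate the byte into every byte of a long (0x0101...01 * c) */
	cu *= (unsigned long)-1 / 0xff;

	/* Batched loop: only runs while a whole batch still fits */
	while (count >= BYTES_LONG * BATCH) {
		/* memcpy of one word keeps this sketch free of aliasing issues */
		for (i = 0; i < BATCH; i++)
			memcpy(p + i * BYTES_LONG, &cu, BYTES_LONG);
		p += BYTES_LONG * BATCH;
		count -= BYTES_LONG * BATCH;
	}

	/* Optional word-sized tail, so at most BYTES_LONG - 1 bytes remain */
	if (word_tail) {
		while (count >= BYTES_LONG) {
			memcpy(p, &cu, BYTES_LONG);
			p += BYTES_LONG;
			count -= BYTES_LONG;
		}
	}

	/* Whatever is left is written one byte at a time */
	tail = count;
	while (count--)
		*p++ = (unsigned char)c;

	return tail;
}
```

For a buffer of BYTES_LONG * BATCH - 1 bytes, the call with word_tail == 0
leaves the whole buffer to the byte loop, while word_tail == 1 leaves only
BYTES_LONG - 1 bytes to it. Whether the extra loop pays off on real
hardware would have to be measured.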