From: Matteo Croce
Date: Sun, 11 Jul 2021 01:07:49 +0200
Subject: Re: [PATCH v2 0/3] lib/string: optimized mem* functions
To: Andrew Morton
Cc: Linux Kernel Mailing List, Nick Kossifidis, Guo Ren, Christoph Hellwig,
    David Laight, Palmer Dabbelt, Emil Renner Berthing, Drew Fustini,
    linux-arch, Nick Desaulniers, linux-riscv

On Sat, Jul 10, 2021 at 11:31 PM Andrew Morton wrote:
>
> On Fri, 2 Jul 2021 14:31:50 +0200 Matteo Croce wrote:
>
> > From: Matteo Croce
> >
> > Rewrite the generic mem{cpy,move,set} so that memory is accessed with
> > the widest size possible, but without doing unaligned accesses.
> >
> > This was originally posted as C string functions for RISC-V[1], but as
> > there was no specific RISC-V code, it was proposed for the generic
> > lib/string.c implementation.
> >
> > Tested on RISC-V and on x86_64 by undefining __HAVE_ARCH_MEM{CPY,SET,MOVE}
> > and HAVE_EFFICIENT_UNALIGNED_ACCESS.
> >
> > These are the performances of memcpy() and memset() of a RISC-V machine
> > on a 32 mbyte buffer:
> >
> > memcpy:
> > original aligned:    75 Mb/s
> > original unaligned:  75 Mb/s
> > new aligned:        114 Mb/s
> > new unaligned:      107 Mb/s
> >
> > memset:
> > original aligned:   140 Mb/s
> > original unaligned: 140 Mb/s
> > new aligned:        241 Mb/s
> > new unaligned:      241 Mb/s
>
> Did you record the x86_64 performance?
>
>
> Which other architectures are affected by this change?

x86_64 won't use these functions because it defines __HAVE_ARCH_MEMCPY
and has optimized implementations in arch/x86/lib.
Anyway, I was curious and tested them on x86_64 too: there was zero
gain over the generic ones.

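For readers who want a concrete picture of "widest size possible, but without doing unaligned accesses" from the quoted cover letter, here is a minimal user-space sketch of the idea applied to memset(). It is not the code from the patch: the names sketch_memset and word_t are hypothetical, and the may_alias attribute is a GCC/Clang extension assumed here so that the word-sized stores are valid outside the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical word type; may_alias is a GCC/Clang extension. */
typedef unsigned long __attribute__((__may_alias__)) word_t;

/* Illustrative only: not the implementation from the patch series. */
void *sketch_memset(void *dst, int c, size_t n)
{
	unsigned char *p = dst;
	unsigned char byte = (unsigned char)c;
	unsigned long word = byte;
	size_t i;

	/* Replicate the fill byte into every byte of a machine word. */
	for (i = 1; i < sizeof(word); i++)
		word = (word << 8) | byte;

	/* Head: single-byte stores until p is word-aligned. */
	while (n && ((uintptr_t)p % sizeof(word_t))) {
		*p++ = byte;
		n--;
	}

	/* Body: aligned, word-sized stores. */
	while (n >= sizeof(word_t)) {
		*(word_t *)p = word;
		p += sizeof(word_t);
		n -= sizeof(word_t);
	}

	/* Tail: remaining bytes, fewer than one word. */
	while (n--)
		*p++ = byte;

	return dst;
}
```

memset() is the easy case because only one pointer has to be aligned. For memcpy() the source and destination can have different alignments, so a word-wise copy needs extra handling that this sketch does not attempt.
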
The only architecture that will use all three functions is riscv, while
memmove() will be used by arc, h8300, hexagon, ia64, openrisc and parisc.
Keep in mind that memmove() isn't anything special: it just calls memcpy()
when possible (e.g. when the buffers don't overlap), and falls back to a
byte-by-byte copy otherwise.

In the future we can write two functions, one that copies forward and
another that copies backward, and call the right one depending on the
buffers' relative position.
Then we could alias memcpy() and memmove(), as proposed by Linus:

https://bugzilla.redhat.com/show_bug.cgi?id=638477#c132

Regards,
--
per aspera ad upstream
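
The forward/backward idea mentioned above could look roughly like the following user-space sketch. All names here (copy_forward, copy_backward, sketch_memmove) are hypothetical, and both helpers copy byte by byte only to keep the example short; in an optimized version each direction would presumably use wider accesses.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy n bytes from the lowest address upward. */
static void copy_forward(unsigned char *d, const unsigned char *s, size_t n)
{
	while (n--)
		*d++ = *s++;
}

/* Copy n bytes from the highest address downward. */
static void copy_backward(unsigned char *d, const unsigned char *s, size_t n)
{
	d += n;
	s += n;
	while (n--)
		*--d = *--s;
}

/* Illustrative only: picks the copy direction from the buffer positions. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	/*
	 * A forward copy never overwrites unread source bytes when the
	 * destination starts below the source; a backward copy is safe
	 * in the opposite case.
	 */
	if ((uintptr_t)d < (uintptr_t)s)
		copy_forward(d, s, n);
	else
		copy_backward(d, s, n);

	return dst;
}
```

Because one of the two directions is always safe, a function structured like this handles overlapping and non-overlapping buffers alike, which is what would make aliasing memcpy() to it possible.
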