Date: Mon, 15 Jan 2024 17:32:20 +0800
From: Jisheng Zhang
To: Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Samuel Holland
Subject: Re: [PATCH v2] riscv: select ARCH_HAS_FAST_MULTIPLIER
References: <20231202135202.4071-1-jszhang@kernel.org>
In-Reply-To: <20231202135202.4071-1-jszhang@kernel.org>

On Sat, Dec 02, 2023 at 09:52:02PM +0800, Jisheng Zhang wrote:
> Currently, riscv linux requires at least IMA, so all platforms have a
> multiplier. And I assume the 'mul' efficiency is comparable or better
> than a sequence of five or so register-dependent arithmetic
> instructions. Select ARCH_HAS_FAST_MULTIPLIER to get slightly nicer
> codegen. Refer to commit f9b4192923fa ("[PATCH] bitops: hweight()
> speedup") for more details.
>
> In a simple benchmark test calling hweight64() in a loop, it got:
> about 14% performance improvement on JH7110, tested on Milkv Mars.
>
> about 23% performance improvement on TH1520 and SG2042, tested on
> Sipeed LPI4A and SG2042 platform.
>
> a slight performance drop on CV1800B, tested on milkv duo. Among all
> riscv platforms in my hands, this is the only one which sees a slight
> performance drop. It means the 'mul' isn't quick enough.
> However, the situation exists on x86 too, for example, P4 doesn't
> have fast integer multiplies as said in the above commit, x86 also
> selects ARCH_HAS_FAST_MULTIPLIER. So let's select
> ARCH_HAS_FAST_MULTIPLIER which can benefit almost all riscv platforms.
>
> Samuel also provided some performance numbers:
> On Unmatched: 20% speedup for __sw_hweight32 and 30% speedup for
> __sw_hweight64.
> On D1: 8% speedup for __sw_hweight32 and 8% slowdown for
> __sw_hweight64.
>
> Signed-off-by: Jisheng Zhang
> Reviewed-by: Samuel Holland
> Tested-by: Samuel Holland

Hi @Palmer,

It looks like this simple patch was missed in your for-next tree. Could
you please pick it up?

Thanks in advance

> ---
>
> since v1:
>  - fix typo in commit msg
>  - add some performance numbers provided by Samuel
>  - collect Reviewed-by and Tested-by tag
>
>  arch/riscv/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 95a2a06acc6a..e4834fa76417 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -23,6 +23,7 @@ config RISCV
>  	select ARCH_HAS_DEBUG_VIRTUAL if MMU
>  	select ARCH_HAS_DEBUG_VM_PGTABLE
>  	select ARCH_HAS_DEBUG_WX
> +	select ARCH_HAS_FAST_MULTIPLIER
>  	select ARCH_HAS_FORTIFY_SOURCE
>  	select ARCH_HAS_GCOV_PROFILE_ALL
>  	select ARCH_HAS_GIGANTIC_PAGE
> --
> 2.42.0
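
For readers without the referenced commit at hand, here is a minimal
user-space sketch of the two 64-bit population-count strategies that
ARCH_HAS_FAST_MULTIPLIER chooses between, as far as I understand the
generic software hweight code. The function names (byte_counts,
hweight64_mul, hweight64_shift, check_variants) are made up for this
illustration and are not the kernel's; the sketch only shows why one
'mul' can stand in for the tail of dependent shift/add steps.

```c
#include <assert.h>
#include <stdint.h>

/* Steps common to both variants: reduce w to eight per-byte bit counts (0..8 each). */
static uint64_t byte_counts(uint64_t w)
{
	w -= (w >> 1) & 0x5555555555555555ULL;
	w = (w & 0x3333333333333333ULL) + ((w >> 2) & 0x3333333333333333ULL);
	return (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
}

/* Multiplier variant: a single multiply sums all eight bytes into the top byte. */
static unsigned int hweight64_mul(uint64_t w)
{
	return (unsigned int)((byte_counts(w) * 0x0101010101010101ULL) >> 56);
}

/* Shift-add variant: three dependent shift/add pairs, then a final mask. */
static unsigned int hweight64_shift(uint64_t w)
{
	uint64_t res = byte_counts(w);

	res += res >> 8;
	res += res >> 16;
	res += res >> 32;
	return (unsigned int)(res & 0xff);
}

/* Illustrative self-check: both variants must agree for any input. */
static void check_variants(uint64_t w)
{
	assert(hweight64_mul(w) == hweight64_shift(w));
}
```

The two variants differ only in the final reduction, so any speedup or
slowdown like the numbers quoted above would come down to how the
latency of one 'mul' compares with that short chain of dependent
shifts and adds on a given core.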