From: Charlie Jenkins
Date: Mon, 08 Jan 2024 15:57:02 -0800
Subject: [PATCH v15 1/5] asm-generic: Improve csum_fold
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20240108-optimize_checksum-v15-1-1c50de5f2167@rivosinc.com>
References: <20240108-optimize_checksum-v15-0-1c50de5f2167@rivosinc.com>
In-Reply-To: <20240108-optimize_checksum-v15-0-1c50de5f2167@rivosinc.com>
To: Charlie Jenkins, Palmer Dabbelt, Conor Dooley, Samuel Holland,
    David Laight, Xiao Wang, Evan Green, Guo Ren,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-arch@vger.kernel.org
Cc: Paul Walmsley, Albert Ou, Arnd Bergmann, David Laight

This csum_fold implementation, introduced into arch/arc by Vineet Gupta,
is better than the default implementation on at least arc, x86, and
riscv. Using GCC trunk and compiling a non-inlined version with -O3,
this implementation has 41.6667% fewer instructions on riscv64 and 25%
fewer on x86-64.

Most architectures override this default in asm, but this version should
be more performant than all of those other implementations except for
arm, which has barrel shifting, and sparc32, which has a carry flag.

Signed-off-by: Charlie Jenkins
Reviewed-by: David Laight
---
 include/asm-generic/checksum.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/checksum.h b/include/asm-generic/checksum.h
index 43e18db89c14..ad928cce268b 100644
--- a/include/asm-generic/checksum.h
+++ b/include/asm-generic/checksum.h
@@ -2,6 +2,8 @@
 #ifndef __ASM_GENERIC_CHECKSUM_H
 #define __ASM_GENERIC_CHECKSUM_H
 
+#include
+
 /*
  * computes the checksum of a memory block at buff, length len,
  * and adds in "sum" (32-bit)
@@ -31,9 +33,7 @@ extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl);
 static inline __sum16 csum_fold(__wsum csum)
 {
 	u32 sum = (__force u32)csum;
-	sum = (sum & 0xffff) + (sum >> 16);
-	sum = (sum & 0xffff) + (sum >> 16);
-	return (__force __sum16)~sum;
+	return (__force __sum16)((~sum - ror32(sum, 16)) >> 16);
 }
 
 #endif
-- 
2.43.0
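
For readers who want to convince themselves that the one-line fold returns
the same value as the two-step fold it replaces, the following is a minimal
userspace sketch, not part of the patch. It assumes plain stdint types in
place of the kernel's u32/__wsum/__sum16, and a local ror32() written to
behave like the kernel's rotate-right helper; the names csum_fold_old,
csum_fold_new, fold_self_check and count_fold_mismatches are hypothetical
and exist only for this illustration. The reasoning behind the identity:
sum + ror32(sum, 16) places hi + lo in both halves, and the carry out of
the low half propagates into the high half, which is the end-around carry
the old code applied in its second step. Since ~a - b equals ~(a + b) in
modular arithmetic, the upper 16 bits of the difference are the complement
of that folded sum.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's ror32() (assumption: same semantics). */
static inline uint32_t ror32(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* The fold removed by the patch: add the halves twice, then complement. */
static inline uint16_t csum_fold_old(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* The fold added by the patch, with kernel types swapped for stdint types. */
static inline uint16_t csum_fold_new(uint32_t sum)
{
	return (uint16_t)((~sum - ror32(sum, 16)) >> 16);
}

/* Spot-check a few hand-worked inputs, including both carry edge cases. */
void fold_self_check(void)
{
	assert(csum_fold_new(0x00000000u) == 0xffff);
	assert(csum_fold_new(0xffffffffu) == 0x0000);
	assert(csum_fold_new(0x0001ffffu) == 0xfffe);
	assert(csum_fold_new(0x12345678u) == 0x9753);
}

/*
 * Compare both folds on every stride-th 32-bit value and return how many
 * inputs disagree. A stride of 1 is an exhaustive check over all 2^32
 * inputs; a stride of 0 is treated as 1 to avoid looping forever.
 */
unsigned long count_fold_mismatches(uint32_t stride)
{
	unsigned long mismatches = 0;
	uint64_t v;

	if (stride == 0)
		stride = 1;
	for (v = 0; v <= UINT32_MAX; v += stride)
		if (csum_fold_old((uint32_t)v) != csum_fold_new((uint32_t)v))
			mismatches++;
	return mismatches;
}
```

Calling fold_self_check() and then count_fold_mismatches(1) from a small
main() gives an exhaustive comparison; a larger odd stride such as 65521
samples the input space much faster. This only checks that the return
values agree and says nothing about the instruction counts quoted in the
commit message.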