From: Charlie Jenkins
Date: Wed, 20 Dec 2023 15:37:39 -0800
Subject: [PATCH v13 1/5] asm-generic: Improve csum_fold
Message-Id: <20231220-optimize_checksum-v13-1-a73547e1cad8@rivosinc.com>
References: <20231220-optimize_checksum-v13-0-a73547e1cad8@rivosinc.com>
In-Reply-To: <20231220-optimize_checksum-v13-0-a73547e1cad8@rivosinc.com>
To: Charlie Jenkins, Palmer Dabbelt, Conor Dooley, Samuel Holland,
 David Laight, Xiao Wang, Evan Green, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Cc: Paul Walmsley, Albert Ou, Arnd Bergmann, David Laight

This csum_fold implementation, introduced into arch/arc by Vineet Gupta,
is better than the default implementation on at least arc, x86, and
riscv. Using GCC trunk and compiling a non-inlined version with -O3,
this implementation has 41.6667% and 25% fewer instructions on riscv64
and x86-64, respectively.

Most implementations override this default in asm, but this should be
more performant than all of those other implementations except for arm,
which has barrel shifting, and sparc32, which has a carry flag.

Signed-off-by: Charlie Jenkins
Reviewed-by: David Laight
---
 include/asm-generic/checksum.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/checksum.h b/include/asm-generic/checksum.h
index 43e18db89c14..ad928cce268b 100644
--- a/include/asm-generic/checksum.h
+++ b/include/asm-generic/checksum.h
@@ -2,6 +2,8 @@
 #ifndef __ASM_GENERIC_CHECKSUM_H
 #define __ASM_GENERIC_CHECKSUM_H
 
+#include <linux/bitops.h>
+
 /*
  * computes the checksum of a memory block at buff, length len,
  * and adds in "sum" (32-bit)
@@ -31,9 +33,7 @@ extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl);
 static inline __sum16 csum_fold(__wsum csum)
 {
 	u32 sum = (__force u32)csum;
-	sum = (sum & 0xffff) + (sum >> 16);
-	sum = (sum & 0xffff) + (sum >> 16);
-	return (__force __sum16)~sum;
+	return (__force __sum16)((~sum - ror32(sum, 16)) >> 16);
 }
 
 #endif
-- 
2.43.0
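
The single-expression fold replaces the two add-and-mask rounds, so it may help to check in userspace that both forms produce the same 16-bit result. The following is a minimal sketch and not part of the patch: it assumes plain uint32_t/uint16_t in place of the kernel's __wsum/__sum16, and the names ror32_sketch, csum_fold_old, csum_fold_new, assert_folds_agree and count_fold_mismatches are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by shift bits; an assumed stand-in for the kernel's ror32(). */
static inline uint32_t ror32_sketch(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* The fold removed by the diff: two add-and-mask rounds, then complement. */
static inline uint16_t csum_fold_old(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* The fold added by the diff, written with plain integer types. */
static inline uint16_t csum_fold_new(uint32_t sum)
{
	return (uint16_t)((~sum - ror32_sketch(sum, 16)) >> 16);
}

/* Abort via assert() if the two folds disagree on a single input. */
static inline void assert_folds_agree(uint32_t sum)
{
	assert(csum_fold_old(sum) == csum_fold_new(sum));
}

/*
 * Walk the 32-bit input space in strides of 'step' (a step of 0 is
 * treated as 1) and count the inputs where the two folds differ.
 */
static inline unsigned long count_fold_mismatches(uint32_t step)
{
	unsigned long mismatches = 0;
	uint64_t v;

	if (step == 0)
		step = 1;

	/* A 64-bit loop counter avoids wrapping past UINT32_MAX. */
	for (v = 0; v <= UINT32_MAX; v += step) {
		if (csum_fold_old((uint32_t)v) != csum_fold_new((uint32_t)v))
			mismatches++;
	}
	return mismatches;
}
```

Calling count_fold_mismatches(1) from a small main() covers every 32-bit input, at the cost of about four billion iterations; a larger stride gives a quicker spot check.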