From: David Laight
To: 'Christophe Leroy', "David S. Miller", Jakub Kicinski
CC: "linux-kernel@vger.kernel.org", "linuxppc-dev@lists.ozlabs.org", "netdev@vger.kernel.org"
Subject: RE: [PATCH] net: Remove branch in csum_shift()
Date: Sun, 13 Feb 2022 02:39:06 +0000
Message-ID: <7f16910a8f63475dae012ef5135f41d1@AcuMS.aculab.com>

From: Christophe Leroy
> Sent: 11 February 2022 08:48
>
> Today's implementation of csum_shift() leads to branching based on
> parity of 'offset'
>
> 000002f8 :
>      2f8:	70 a5 00 01 	andi.   r5,r5,1
>      2fc:	41 a2 00 08 	beq     304
>      300:	54 84 c0 3e 	rotlwi  r4,r4,24
>      304:	7c 63 20 14 	addc    r3,r3,r4
>      308:	7c 63 01 94 	addze   r3,r3
>      30c:	4e 80 00 20 	blr
>
> Use first bit of 'offset' directly as input of the rotation instead of
> branching.
>
> 000002f8 :
>      2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
>      2fc:	20 a5 00 20 	subfic  r5,r5,32
>      300:	5c 84 28 3e 	rotlw   r4,r4,r5
>      304:	7c 63 20 14 	addc    r3,r3,r4
>      308:	7c 63 01 94 	addze   r3,r3
>      30c:	4e 80 00 20 	blr
>
> And change to left shift instead of right shift to skip one more
> instruction. This has no impact on the final sum.
>
> 000002f8 :
>      2f8:	54 a5 1f 38 	rlwinm  r5,r5,3,28,28
>      2fc:	5c 84 28 3e 	rotlw   r4,r4,r5
>      300:	7c 63 20 14 	addc    r3,r3,r4
>      304:	7c 63 01 94 	addze   r3,r3
>      308:	4e 80 00 20 	blr

That is ppc64.
What happens on x86-64?

Trying to do the same in the x86 ipcsum code tended to make the code worse.
(Although that test is for an odd length fragment and can just be removed.)

	David

>
> Signed-off-by: Christophe Leroy
> ---
>  include/net/checksum.h | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/include/net/checksum.h b/include/net/checksum.h
> index 5218041e5c8f..9badcd5532ef 100644
> --- a/include/net/checksum.h
> +++ b/include/net/checksum.h
> @@ -83,9 +83,7 @@ static inline __sum16 csum16_sub(__sum16 csum, __be16 addend)
>  static inline __wsum csum_shift(__wsum sum, int offset)
>  {
>  	/* rotate sum to align it with a 16b boundary */
> -	if (offset & 1)
> -		return (__force __wsum)ror32((__force u32)sum, 8);
> -	return sum;
> +	return (__force __wsum)rol32((__force u32)sum, (offset & 1) << 3);
>  }
>
>  static inline __wsum
> --
> 2.34.1
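
The patch's claim that switching from ror32() to rol32() "has no impact on the final sum" can be checked in userspace. What follows is a minimal sketch, assuming the shifted value is later combined with an end-around-carry add and folded to 16 bits. The names fold16(), csum_add_u32(), csum_shift_branch(), csum_shift_branchless(), variants_agree() and self_check() are hypothetical and exist only for this illustration; the ror32()/rol32() defined here are portable stand-ins for the kernel helpers of the same name. It says nothing about the code a compiler generates on x86-64.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ror32()/rol32(). Masking the
 * counts with 31 keeps a shift of 0 well defined in portable C. */
static inline uint32_t ror32(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
	return (word << (shift & 31)) | (word >> ((-shift) & 31));
}

/* 32-bit one's-complement add with end-around carry (hypothetical name). */
static inline uint32_t csum_add_u32(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}

/* Fold to 16 bits with end-around carry. Hypothetical helper: it does
 * not invert the result, so it is not a drop-in for csum_fold(). */
static inline uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Behaviour before the patch: branch on the parity of offset. */
static inline uint32_t csum_shift_branch(uint32_t sum, int offset)
{
	if (offset & 1)
		return ror32(sum, 8);
	return sum;
}

/* Behaviour after the patch: rotate left by 0 or 8, no branch. */
static inline uint32_t csum_shift_branchless(uint32_t sum, int offset)
{
	return rol32(sum, (offset & 1) << 3);
}

/* Returns 1 if both variants give the same folded 16-bit result after
 * the shifted addend is added to csum. */
static inline int variants_agree(uint32_t csum, uint32_t addend, int offset)
{
	uint32_t a = csum_add_u32(csum, csum_shift_branch(addend, offset));
	uint32_t b = csum_add_u32(csum, csum_shift_branchless(addend, offset));

	return fold16(a) == fold16(b);
}

/* Exercise both parities of offset over a few sample values. */
static void self_check(void)
{
	static const uint32_t samples[] = {
		0, 1, 0xff, 0x12345678, 0x9abcdef0, 0xfffffffe, 0xffffffff
	};
	const unsigned int n = sizeof(samples) / sizeof(samples[0]);
	unsigned int i, j;
	int offset;

	for (i = 0; i < n; i++)
		for (j = 0; j < n; j++)
			for (offset = 0; offset < 2; offset++)
				assert(variants_agree(samples[i], samples[j], offset));
}
```

The 32-bit intermediates differ for odd offsets (ror32 by 8 and rol32 by 8 swap which 16-bit half ends up high), but the fold adds the two halves together, so the order does not matter once the sum is reduced to 16 bits.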