Date: Thu, 17 May 2018 09:38:10 -0500
From: Segher Boessenkool
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Scott Wood, Shile Zhang, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Revert "powerpc/64: Fix checksum folding in csum_add()"
Message-ID: <20180517143810.GV17342@gate.crashing.org>
In-Reply-To: <20180410063437.217D2653BC@po15720vm.idsi0.si.c-s.fr>

On Tue, Apr 10, 2018 at 08:34:37AM +0200, Christophe Leroy wrote:
> This reverts commit
> 6ad966d7303b70165228dba1ee8da1a05c10eefe.
>
> That commit was pointless, because csum_add() sums two 32 bits
> values, so the sum is 0x1fffffffe at the maximum.
> And then when adding upper part (1) and lower part (0xfffffffe),
> the result is 0xffffffff which doesn't carry.
> Any lower value will not carry either.
>
> And behind the fact that this commit is useless, it also kills the
> whole purpose of having an arch specific inline csum_add()
> because the resulting code gets even worse than what is obtained
> with the generic implementation of csum_add()

:-)

> And the reverted implementation for PPC64 gives:
>
> 0000000000000240 <.csum_add>:
>  240:   7c 84 1a 14     add     r4,r4,r3
>  244:   78 80 00 22     rldicl  r0,r4,32,32
>  248:   7c 80 22 14     add     r4,r0,r4
>  24c:   78 83 00 20     clrldi  r3,r4,32
>  250:   4e 80 00 20     blr

If you really, really, *really* want to optimise this you could
make it:

	rldimi r3,r3,32,0
	rldimi r4,r4,32,0
	add r3,r3,r4
	srdi r3,r3,32
	blr

which is the same size, but has a shorter critical path length.  Very
analogous to how you fold 64->32.


Segher
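
For reference, the two approaches discussed above can be sketched in plain C. This is only an illustration, not the kernel code: the names csum_add_fold(), csum_add_dup() and csum_add_check() are made up here, and plain uint32_t stands in for the kernel's __wsum. The first function adds and then folds the upper part into the lower part; the second copies each 32-bit value into the upper half of a 64-bit word, adds, and keeps the upper half, so the carry out of the low half lands in the result without a separate fold step.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: add, then fold the carry back in (end-around carry). */
static uint32_t csum_add_fold(uint32_t csum, uint32_t addend)
{
	/* Sum of two 32-bit values is at most 0x1fffffffe. */
	uint64_t res = (uint64_t)csum + addend;

	/* Upper part is 0 or 1; when it is 1 the lower part is at most
	 * 0xfffffffe, so this addition cannot carry a second time. */
	return (uint32_t)res + (uint32_t)(res >> 32);
}

/* Hypothetical name: duplicate each value into the upper 32 bits, add,
 * and take the upper 32 bits of the 64-bit sum. */
static uint32_t csum_add_dup(uint32_t csum, uint32_t addend)
{
	uint64_t a = ((uint64_t)csum << 32) | csum;
	uint64_t b = ((uint64_t)addend << 32) | addend;

	/* The carry out of the low halves is added into the high halves;
	 * the carry out of bit 63 is simply dropped. */
	return (uint32_t)((a + b) >> 32);
}

/* Hypothetical spot check: both variants agree on a few edge cases. */
static void csum_add_check(void)
{
	assert(csum_add_fold(0xffffffffu, 0xffffffffu) == 0xffffffffu);
	assert(csum_add_dup(0xffffffffu, 0xffffffffu) == 0xffffffffu);
	assert(csum_add_fold(0x80000000u, 0x80000000u) == 1u);
	assert(csum_add_dup(0x80000000u, 0x80000000u) == 1u);
	assert(csum_add_fold(1u, 2u) == csum_add_dup(1u, 2u));
}
```

In the second variant the two duplications do not depend on each other, which is presumably where the shorter critical path comes from; whether a compiler emits the same instruction sequence from this C is not something the sketch shows.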