Subject: Re: [PATCH] powerpc: Force inlining of csum_add()
To: Segher Boessenkool
Cc: Benjamin Herrenschmidt
, Paul Mackerras, Michael Ellerman, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
References: <20210511105154.GJ10366@gate.crashing.org>
From: Christophe Leroy
Date: Wed, 12 May 2021 14:56:56 +0200
In-Reply-To: <20210511105154.GJ10366@gate.crashing.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 11/05/2021 at 12:51, Segher Boessenkool wrote:
> Hi!
>
> On Tue, May 11, 2021 at 06:08:06AM +0000, Christophe Leroy wrote:
>> Commit 328e7e487a46 ("powerpc: force inlining of csum_partial() to
>> avoid multiple csum_partial() with GCC10") inlined csum_partial().
>>
>> Now that csum_partial() is inlined, GCC outlines csum_add() when
>> called by csum_partial().
>
>> c064fb28 :
>> c064fb28: 7c 63 20 14 addc r3,r3,r4
>> c064fb2c: 7c 63 01 94 addze r3,r3
>> c064fb30: 4e 80 00 20 blr
>
> Could you build this with -fdump-tree-einline-all and send me the
> results? Or open a GCC PR yourself :-)

Ok, I'll forward it to you in a minute.

>
> Something seems to have decided this asm is more expensive than it is.
> That isn't always avoidable -- the compiler cannot look inside asms --
> but it seems it could be improved here.
>
> Do you have (or can make) a self-contained testcase?

I have not tried, and I fear it might be difficult: in a kernel build with
dozens of calls to csum_add(), only ip6_tunnel.o exhibits the issue.

>
>> The sum with 0 is useless, should have been skipped.
>
> That isn't something the compiler can do anything about (not sure if you
> were suggesting that); it has to be done in the user code (and it tries
> to already, see below).
I was not suggesting that, only that when csum_add() is properly inlined
the sum with 0 is skipped (because we put the necessary checks in
csum_add(), of course).

>
>> And there is even one completely unused instance of csum_add().
>
> That is strange, that should never happen.

It seems that several .o files include unused copies of csum_add(). After
the final link, one of them remains in vmlinux (in addition to the used one).

>
>> ./arch/powerpc/include/asm/checksum.h: In function '__ip6_tnl_rcv':
>> ./arch/powerpc/include/asm/checksum.h:94:22: warning: inlining failed in call to 'csum_add': call is unlikely and code size would grow [-Winline]
>>    94 | static inline __wsum csum_add(__wsum csum, __wsum addend)
>>       |                      ^~~~~~~~
>> ./arch/powerpc/include/asm/checksum.h:172:31: note: called from here
>>   172 |                         sum = csum_add(sum, (__force __wsum)*(const u32 *)buff);
>>       |                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> At least we say what happened. Progress! :-)

Lol. I've been seeing this warning for a long time, so I guess it's nothing new.

>
>> In the non-inlined version, the first sum with 0 was performed.
>> Here it is skipped.
>
> That is because of how __builtin_constant_p works, most likely. As we
> discussed elsewhere it is evaluated before all forms of loop unrolling.

But we are not talking about loop unrolling here, are we?

It seems that the reason here is that __builtin_constant_p() is evaluated
long after GCC has decided not to inline that call to csum_add().

Christophe
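
For reference, the pattern discussed in this thread (constant-zero short-circuits guarded by __builtin_constant_p(), combined with forced inlining as the subject proposes) can be sketched in portable C. This is only an illustration under stated assumptions: wsum_t and csum_add_sketch() are hypothetical names, the 64-bit fold stands in for the addc/addze pair quoted above, and it is not the code from arch/powerpc/include/asm/checksum.h.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __wsum type. */
typedef uint32_t wsum_t;

/*
 * Ones' complement 32-bit add with end-around carry.
 *
 * always_inline asks GCC/Clang to inline every call regardless of the
 * inliner's cost estimate. Once the body is inlined, the
 * __builtin_constant_p() tests can see a literal 0 argument at the call
 * site and fold the whole call down to "return the other operand".
 * In an out-of-line copy the arguments are never compile-time
 * constants, so the generic add is always executed.
 */
static inline __attribute__((always_inline))
wsum_t csum_add_sketch(wsum_t csum, wsum_t addend)
{
	/* Compile-time shortcuts: adding a known zero is a no-op. */
	if (__builtin_constant_p(csum) && csum == 0)
		return addend;
	if (__builtin_constant_p(addend) && addend == 0)
		return csum;

	/* Generic path: add in 64 bits, then fold the carry back in. */
	uint64_t res = (uint64_t)csum + addend;

	return (wsum_t)res + (wsum_t)(res >> 32);
}
```

With this shape, a call such as csum_add_sketch(0, x) needs no add at all once inlined, which is the "sum with 0 is skipped" behaviour described above. The shortcuts only change the generated code, never the result.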