Subject: Re: [PATCH v2] MIPS: Check __clang__ to avoid performance influence with GCC in csum_tcpudp_nofold()
To: David Laight, "Maciej W. Rozycki"
Cc: Thomas Bogendoerfer, "linux-mips@vger.kernel.org", "linux-kernel@vger.kernel.org", Xuefeng Li
From: Tiezhu Yang
Date: Wed, 17 Mar 2021 08:26:23 +0800
Message-ID: <5ee86b3b-81d2-790c-f67b-e250f60272fd@loongson.cn>
In-Reply-To: <913665e71fd44c5d810d006cd179725c@AcuMS.aculab.com>
References: <1615263493-10609-1-git-send-email-yangtiezhu@loongson.cn> <913665e71fd44c5d810d006cd179725c@AcuMS.aculab.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/15/2021 08:42 PM, David Laight wrote:
> From: Tiezhu Yang
>> Sent: 15 March 2021 12:26
>> On 03/15/2021 04:49 AM, Maciej W.
Rozycki wrote:
>>> On Tue, 9 Mar 2021, Tiezhu Yang wrote:
>>>
>>>> diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
>>>> index 1e6c135..80eddd4 100644
>>>> --- a/arch/mips/include/asm/checksum.h
>>>> +++ b/arch/mips/include/asm/checksum.h
>>>> @@ -128,9 +128,13 @@ static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl)
>>>>
>>>>  static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
>>>>                                          __u32 len, __u8 proto,
>>>> -                                        __wsum sum)
>>>> +                                        __wsum sum_in)
>>>>  {
>>>> -        unsigned long tmp = (__force unsigned long)sum;
>>>> +#ifdef __clang__
>>>> +        unsigned long sum = (__force unsigned long)sum_in;
>>>> +#else
>>>> +        __wsum sum = sum_in;
>>>> +#endif
>>> This looks much better to me, but I'd keep the variable names unchanged
>>> as `sum_in' isn't used beyond the initial assignment anyway (you'll have
>>> to update the references with asm operands accordingly of course).
>>>
>>> Have you verified that code produced by GCC remains the same with your
>>> change in place as it used to be up to commit 198688edbf77? I can see no
>>> such information in the commit description whether here or in the said
>>> commit.
>>>
>>> Maciej
>> Hi Maciej,
>>
>> Thanks for your reply.
>>
>> gcc --version
>> gcc (Debian 10.2.1-6) 10.2.1 20210110
>>
>> net/ipv4/tcp_ipv4.c
>> tcp_v4_send_reset()
>>     csum_tcpudp_nofold()
>>
>> objdump -d vmlinux > vmlinux.dump
>>
>> (1) Before commit 198688edbf77
>> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
>> with Clang"):
>>
>> ffffffff80aa835c:  00004025  move    a4,zero
>> ffffffff80aa8360:  92020012  lbu     v0,18(s0)
>> ffffffff80aa8364:  de140030  ld      s4,48(s0)
>> ffffffff80aa8368:  0064182d  daddu   v1,v1,a0
>> ffffffff80aa836c:  304200ff  andi    v0,v0,0xff
>> ffffffff80aa8370:  9c64000c  lwu     a0,12(v1)
>> ffffffff80aa8374:  9c660010  lwu     a2,16(v1)
>> ffffffff80aa8378:  afa70038  sw      a3,56(sp)
>> ffffffff80aa837c:  24071a00  li      a3,6656
>> ffffffff80aa8380:  0086202d  daddu   a0,a0,a2
>> ffffffff80aa8384:  0087202d  daddu   a0,a0,a3
>> ffffffff80aa8388:  0088202d  daddu   a0,a0,a4
>> ffffffff80aa838c:  0004083c  dsll32  at,a0,0x0
>> ffffffff80aa8390:  0081202d  daddu   a0,a0,at
>> ffffffff80aa8394:  0081082b  sltu    at,a0,at
>> ffffffff80aa8398:  0004203f  dsra32  a0,a0,0x0
>> ffffffff80aa839c:  00812021  addu    a0,a0,at
>>
>> (2) After commit 198688edbf77
>> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
>> with Clang"):
>>
>> ffffffff80aa836c:  00004025  move    a4,zero
>> ffffffff80aa8370:  92040012  lbu     a0,18(s0)
>> ffffffff80aa8374:  de140030  ld      s4,48(s0)
>> ffffffff80aa8378:  0062182d  daddu   v1,v1,v0
>> ffffffff80aa837c:  308400ff  andi    a0,a0,0xff
>> ffffffff80aa8380:  9c62000c  lwu     v0,12(v1)
>> ffffffff80aa8384:  9c660010  lwu     a2,16(v1)
>> ffffffff80aa8388:  afa70038  sw      a3,56(sp)
>> ffffffff80aa838c:  24071a00  li      a3,6656
>> ffffffff80aa8390:  0046102d  daddu   v0,v0,a2
>> ffffffff80aa8394:  0047102d  daddu   v0,v0,a3
>> ffffffff80aa8398:  0048102d  daddu   v0,v0,a4
>> ffffffff80aa839c:  0002083c  dsll32  at,v0,0x0
>> ffffffff80aa83a0:  0041102d  daddu   v0,v0,at
>> ffffffff80aa83a4:  0041082b  sltu    at,v0,at
>> ffffffff80aa83a8:  0002103f  dsra32  v0,v0,0x0
>> ffffffff80aa83ac:  00411021  addu    v0,v0,at
>>
>> (3) With this patch:
>>
>> ffffffff80aa835c:  00004025  move    a4,zero
>> ffffffff80aa8360:  92020012  lbu     v0,18(s0)
>> ffffffff80aa8364:  de140030  ld      s4,48(s0)
>> ffffffff80aa8368:  0064182d  daddu   v1,v1,a0
>> ffffffff80aa836c:  304200ff  andi    v0,v0,0xff
>> ffffffff80aa8370:  9c64000c  lwu     a0,12(v1)
>> ffffffff80aa8374:  9c660010  lwu     a2,16(v1)
>> ffffffff80aa8378:  afa70038  sw      a3,56(sp)
>> ffffffff80aa837c:  24071a00  li      a3,6656
>> ffffffff80aa8380:  0086202d  daddu   a0,a0,a2
>> ffffffff80aa8384:  0087202d  daddu   a0,a0,a3
>> ffffffff80aa8388:  0088202d  daddu   a0,a0,a4
>> ffffffff80aa838c:  0004083c  dsll32  at,a0,0x0
>> ffffffff80aa8390:  0081202d  daddu   a0,a0,at
>> ffffffff80aa8394:  0081082b  sltu    at,a0,at
>> ffffffff80aa8398:  0004203f  dsra32  a0,a0,0x0
>> ffffffff80aa839c:  00812021  addu    a0,a0,at
>>
>> (4) With the following changes based on commit 198688edbf77
>> ("MIPS: Fix inline asm input/output type mismatch in checksum.h used
>> with Clang"):
>>
>> diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
>> index 1e6c135..e1f80407 100644
>> --- a/arch/mips/include/asm/checksum.h
>> +++ b/arch/mips/include/asm/checksum.h
>> @@ -130,7 +130,11 @@ static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
>>                                          __u32 len, __u8 proto,
>>                                          __wsum sum)
>>  {
>> +#ifdef __clang__
>>          unsigned long tmp = (__force unsigned long)sum;
>> +#else
>> +        __wsum tmp = sum;
>> +#endif
>>
>>          __asm__(
>>          "       .set    push            # csum_tcpudp_nofold\n"
>>
>> ffffffff80aa835c:  00004025  move    a4,zero
>> ffffffff80aa8360:  92020012  lbu     v0,18(s0)
>> ffffffff80aa8364:  de140030  ld      s4,48(s0)
>> ffffffff80aa8368:  0064182d  daddu   v1,v1,a0
>> ffffffff80aa836c:  304200ff  andi    v0,v0,0xff
>> ffffffff80aa8370:  9c64000c  lwu     a0,12(v1)
>> ffffffff80aa8374:  9c660010  lwu     a2,16(v1)
>> ffffffff80aa8378:  afa70038  sw      a3,56(sp)
>> ffffffff80aa837c:  24071a00  li      a3,6656
>> ffffffff80aa8380:  0086202d  daddu   a0,a0,a2
>> ffffffff80aa8384:  0087202d  daddu   a0,a0,a3
>> ffffffff80aa8388:  0088202d  daddu   a0,a0,a4
>> ffffffff80aa838c:  0004083c  dsll32  at,a0,0x0
>> ffffffff80aa8390:  0081202d  daddu   a0,a0,at
>> ffffffff80aa8394:  0081082b  sltu    at,a0,at
>> ffffffff80aa8398:  0004203f  dsra32  a0,a0,0x0
>> ffffffff80aa839c:  00812021  addu    a0,a0,at
>>
>> The code produced by GCC remains the same between (1), (3) and (4),
>> the last changes looks like better (with less changes based on commit
>> 198688edbf77), so I will send v3 later.
> Aren't those all the same - apart from register selection.
> Not that I grok the mips opcodes.
> But that code has horridness on its side.
>
> The only obvious difference is that something else changes the
> code offset from xxxx5c to xxxx6c.
>
> David

Hi David,

Yes, there do not seem to be any obvious differences.
Let me wait for other feedback.

Hi Thomas and Maciej,

Is this patch necessary? If not, we can ignore it.
If so, I will send v3 with the changes in (4) above.

Thanks,
Tiezhu
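
For anyone who, like David, does not read MIPS opcodes: the last five instructions of each listing (dsll32, daddu, sltu, dsra32, addu) appear to fold the 64-bit running sum into 32 bits with an end-around carry, and the three daddu instructions before them accumulate the operands. Below is a rough portable C sketch of that arithmetic. The function names are hypothetical and this is not the kernel implementation; the "<< 8" on (proto + len) assumes a little-endian build, which would be consistent with "li a3,6656" (26 << 8, i.e. proto 6 plus len 20) in the listings.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): fold a 64-bit running sum into
 * 32 bits with an end-around carry, one step per instruction in the
 * listings above.
 */
static uint32_t fold64_sketch(uint64_t sum)
{
	uint64_t shifted = sum << 32;   /* dsll32 at,a0,0x0: low word moved up */
	uint32_t carry;

	sum += shifted;                 /* daddu a0,a0,at: high word = high + low */
	carry = sum < shifted;          /* sltu at,a0,at: 1 if the addition wrapped */

	/* dsra32 a0,a0,0x0 then addu a0,a0,at: take the high word, add the carry */
	return (uint32_t)(sum >> 32) + carry;
}

/*
 * Hypothetical portable equivalent of the 64-bit path. The "<< 8" is an
 * assumption that holds for little-endian builds only.
 */
static uint32_t csum_tcpudp_nofold_sketch(uint32_t saddr, uint32_t daddr,
					  uint32_t len, uint8_t proto,
					  uint32_t sum)
{
	uint64_t acc = sum;

	acc += saddr;                        /* daddu */
	acc += daddr;                        /* daddu */
	acc += (uint64_t)(proto + len) << 8; /* daddu with the proto/len operand */
	return fold64_sketch(acc);
}
```

Whichever register GCC picks for the accumulator (a0 in (1), (3) and (4), v0 in (2)), the sequence computes the same value, which matches David's reading that the listings differ only in register selection and code offset.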