From: David Laight
To: "'Maciej W. Rozycki'", Tiezhu Yang
CC: Thomas Bogendoerfer, "linux-mips@vger.kernel.org", "linux-kernel@vger.kernel.org", Xuefeng Li
Subject: RE: [PATCH v2] MIPS: Check __clang__ to avoid performance influence with GCC in csum_tcpudp_nofold()
Date: Wed, 17 Mar 2021 16:09:17 +0000
Message-ID: <6e7bc85a3f92419f89117fc1381511be@AcuMS.aculab.com>
References: <1615263493-10609-1-git-send-email-yangtiezhu@loongson.cn> <913665e71fd44c5d810d006cd179725c@AcuMS.aculab.com> <5ee86b3b-81d2-790c-f67b-e250f60272fd@loongson.cn>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Maciej W. Rozycki
> Sent: 17 March 2021 15:36
..
> > > Not that I grok the mips opcodes.
> > > But that code has horridness on its side.
>
> It's a 32-bit one's-complement addition. The use of 64-bit operations
> reduces the number of calculations as any 32-bit carries accumulate in the
> high 32-bit word allowing one instruction to be saved total compared to
> the 32-bit variant. Nothing particularly unusual for me here; I've seen
> worse stuff with x86.

The 'problem' is that mips doesn't have a carry flag.
So the 64-bit maths is 'tricky'.

It may well be that a loop based on:
	do {
		val = *ptr++;
		sum += val;
		carry += sum < val;	/* unsigned wrap shows up as sum < val */
	} while (ptr != limit);
will generate much better code.
I think there is a 'setlt' instruction for the compare.
It certainly would on the nios (which is mips-like).

That is (probably) 6 instructions for 4 bytes.
I suspect there may be a data stall after the memory read.
So an interleaved unroll would remove that stall.
That would be 10 clocks for 8 bytes.

The x86-64 code is 'interesting'.
It has repeated 'add carry' instructions.
On Intel CPUs prior to (at least) Haswell they take two clocks each.
So the code is no faster than adding 32-bit values to a 64-bit sum.

	David
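
For reference, the carry-counting loop and the interleaved unroll suggested above can be sketched in portable C as below. This is only an illustration and not the kernel's csum_tcpudp_nofold(): the function names are made up for the example, a counted for loop replaces the do/while so an empty buffer is handled, and a 64-bit accumulator version is included purely as a cross-check. On MIPS the 'sum < val' compare would be expected to become an sltu, but what the compiler actually emits has not been checked here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add 32-bit words, detecting each carry with an
 * unsigned compare rather than a carry flag. */
static uint32_t sum_words_carry_count(const uint32_t *ptr, size_t n)
{
	uint32_t sum = 0, carry = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t val = ptr[i];

		sum += val;
		carry += sum < val;	/* wrapped iff result is below the addend */
	}
	/* End-around carry: fold the counted carries back into the sum. */
	sum += carry;
	if (sum < carry)
		sum++;
	return sum;
}

/* Hypothetical interleaved unroll: two independent partial sums, so the
 * add for one word need not wait on the load of the next. */
static uint32_t sum_words_interleaved(const uint32_t *ptr, size_t n)
{
	uint32_t sum0 = 0, sum1 = 0, carry = 0;
	size_t i = 0;

	for (; i + 1 < n; i += 2) {
		uint32_t v0 = ptr[i], v1 = ptr[i + 1];

		sum0 += v0;
		carry += sum0 < v0;
		sum1 += v1;
		carry += sum1 < v1;
	}
	if (i < n) {			/* odd word left over */
		uint32_t v = ptr[i];

		sum0 += v;
		carry += sum0 < v;
	}
	sum0 += sum1;			/* merge the two partial sums */
	carry += sum0 < sum1;
	sum0 += carry;
	if (sum0 < carry)
		sum0++;
	return sum0;
}

/* Cross-check: add 32-bit values into a 64-bit sum, then fold twice. */
static uint32_t sum_words_64(const uint32_t *ptr, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += ptr[i];
	sum = (sum & 0xffffffffu) + (sum >> 32);
	sum = (sum & 0xffffffffu) + (sum >> 32);
	return (uint32_t)sum;
}
```

All three return the same folded 32-bit one's-complement sum; which one is fastest depends on the target and compiler, so the instruction and clock counts quoted above would still need measuring.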