From: Vineet Gupta
To: David Laight, 'Arnd Bergmann', "jose.abreu@synopsys.com"
CC: "open list:SYNOPSYS ARC ARCHITECTURE", Linux Kernel Mailing List, "alexey.brodkin@synopsys.com", Joao Pinto, "Vitor Soares"
Subject: Re: [PATCH v2] ARC: io.h: Implement reads{x}()/writes{x}()
Date: Mon, 3 Dec 2018 17:31:25 +0000

On 12/3/18 2:10 AM, David Laight wrote:
> From: Vineet Gupta
> ...
>>> It also seems to have used a different type of loop to the
>>> other example, probably less efficient.
>>> (Not that I'm an expert on ARC opcodes.)
>> The difference is due to ISA and ensuing ARC gcc backends. ARCompact based cores
>> don't support unaligned access and the loop there was ZOL (Zero delay loop). In
>> ARCv2 based cores, the gcc backend has been tweaked to generate fewer ZOLs hence
>> you see the more canonical tst and branch style loop.
> Is this another case of the hardware implementing 'hardware' loop
> instructions that execute slower than ones made of simple instructions?

Not really. ZOLs allow for hardware loops with no instruction/cycle overhead in
general.
However, as micro-arches get more complicated, newer "gizmos" are
added to the machinery, which sometimes makes it harder for the compilers to
optimize for all the cases. The ARCv2 ISA has a new DBNZ instruction (similar to
the x86 one you refer to below) to implement loops, and that is preferred over the ZOL.

> The worst example has to be the x86 'loop' (dec cx and jump nz)
> instruction which is microcoded on intel cpus.
> That makes it very difficult to use the new addx instruction to
> get two dependency chains through a loop.

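For illustration only (this is not code from the patch under discussion), here is a minimal sketch of the kind of count-down loop a reads{x}()-style helper contains. The trailing `do { } while (--count)` is the shape that a decrement-and-branch-if-non-zero instruction such as DBNZ can implement directly. Whether gcc actually emits DBNZ, a ZOL, or a tst-and-branch sequence depends on the target and compiler options. The function name `fifo_read_words` and the plain volatile pointer standing in for an MMIO register are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (name and signature are assumptions, not kernel API):
 * read `count` 32-bit words from one FIFO-style register into `buf`.
 */
static inline void fifo_read_words(const volatile uint32_t *fifo,
                                   uint32_t *buf, size_t count)
{
    /* Guard the zero case, since the do/while body runs at least once. */
    if (count == 0)
        return;

    do {
        /* The source address never changes: every read hits the same port. */
        *buf++ = *fifo;
    } while (--count);  /* decrement, loop back while the count is non-zero */
}
```

The loop counter is used only for the exit test, so nothing else in the body depends on it. That is the property that lets a compiler map the loop onto either a hardware loop or a single decrement-and-branch instruction.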