Date: Wed, 19 May 2021 19:12:57 +0100
From: Catalin Marinas
To: Peter Collingbourne
Cc: Evgenii Stepanov, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Will Deacon, Steven Price, kasan-dev, Linux ARM, Linux Kernel Mailing List
Subject: Re: [PATCH v3] kasan: speed up mte_set_mem_tag_range
Message-ID: <20210519181225.GF21619@arm.com>
References: <20210517235546.3038875-1-eugenis@google.com> <20210518174439.GA28491@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 18, 2021 at 11:11:52AM -0700, Peter Collingbourne wrote:
> On Tue, May 18, 2021 at 10:44 AM Catalin Marinas wrote:
> > If we want to get the best performance out of this, we should look at
> > the memset implementation and do something similar.
> > In principle it's not that far from a memzero, though depending on the
> > microarchitecture it may behave slightly differently.
>
> For Scudo I compared our storeTags implementation linked above against
> __mtag_tag_zero_region from the arm-optimized-routines repository
> (which I think is basically an improved version of that memset
> implementation rewritten to use STG and DC GZVA), and our
> implementation performed better on the hardware that we have access
> to.

That's the advantage of having hardware early ;).

> > Anyway, before that I wonder if we wrote all this in C + inline asm
> > (three while loops or maybe two and some goto), what's the performance
> > difference? It has the advantage of being easier to maintain even if we
> > used some C macros to generate gva/gzva variants.
>
> I'm not sure I agree that it will be easier to maintain. Due to the
> number of "unusual" instructions required here it seems more readable
> to have the code in pure assembly than to require readers to switch
> contexts between C and asm. If we did move it to inline asm then I
> think it should basically be a large blob of asm like the Scudo code
> that I linked.

I was definitely not thinking of a big asm block; that's even less
readable than a separate .S file. It's more like adding dedicated macros
for single STG or DC GVA uses and using them in while loops.

Anyway, let's see a better-commented .S implementation first. Given that
tagging is very sensitive to the performance of this function, we'd
probably benefit from a (few percent, I suspect) perf improvement with
the hand-coded assembly.

--
Catalin
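
For illustration only, the "dedicated macros in while loops" shape
mentioned above could look roughly like the sketch below. It is not the
patch under review and not kernel code: tag_range_sketch(),
set_tag_granule(), set_tag_block() and struct tag_stats are hypothetical
names, and the two helpers merely count calls so the loop structure can
be checked on any machine. In a real arm64 build each helper would wrap
a single STG or DC GVA instruction in inline asm, and block_size would
come from DCZID_EL0 rather than being passed in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MTE assigns one tag per 16-byte granule. */
#define MTE_GRANULE_SIZE 16u

/* Hypothetical bookkeeping: counts how often each helper is used. */
struct tag_stats {
	size_t single_stores;	/* stand-in for one STG each */
	size_t block_stores;	/* stand-in for one DC GVA each */
};

/* Assumption: on arm64 this would be a one-instruction "stg" wrapper. */
static inline void set_tag_granule(uintptr_t addr, struct tag_stats *st)
{
	(void)addr;
	st->single_stores++;
}

/* Assumption: on arm64 this would be a one-instruction "dc gva" wrapper. */
static inline void set_tag_block(uintptr_t addr, struct tag_stats *st)
{
	(void)addr;
	st->block_stores++;
}

/*
 * Three while loops: granule stores up to the first block boundary,
 * block operations across the aligned middle, granule stores for the tail.
 */
static struct tag_stats tag_range_sketch(uintptr_t addr, size_t size,
					 size_t block_size)
{
	struct tag_stats st = { 0, 0 };
	uintptr_t end = addr + size;

	/* The range must be granule-aligned; block size a power of two. */
	assert(addr % MTE_GRANULE_SIZE == 0);
	assert(size % MTE_GRANULE_SIZE == 0);
	assert(block_size >= MTE_GRANULE_SIZE);
	assert((block_size & (block_size - 1)) == 0);

	/* Head: one granule at a time until addr is block-aligned. */
	while (addr < end && (addr & (block_size - 1)) != 0) {
		set_tag_granule(addr, &st);
		addr += MTE_GRANULE_SIZE;
	}

	/* Middle: whole blocks. */
	while (end - addr >= block_size) {
		set_tag_block(addr, &st);
		addr += block_size;
	}

	/* Tail: whatever is left, one granule at a time. */
	while (addr < end) {
		set_tag_granule(addr, &st);
		addr += MTE_GRANULE_SIZE;
	}

	return st;
}
```

With a 64-byte block, tag_range_sketch(0x1030, 0xA0, 64) takes one
granule store for the head, two block operations for the middle and one
granule store for the tail. Whether this form is competitive with the
hand-written .S version is exactly the open question in the thread; the
sketch says nothing about performance.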