From: Arvind Sankar
Date: Tue, 10 Nov 2020 17:15:11 -0500
To: Nick Desaulniers
Cc: Adrian Ratiu, Nathan Chancellor, Arnd Bergmann, Linux ARM,
    clang-built-linux, Russell King, LKML, Collabora Kernel ML,
    Ard Biesheuvel
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
Message-ID: <20201110221511.GA1373528@rani.riverdale.lan>
References: <20201106051436.2384842-1-adrian.ratiu@collabora.com>
    <20201106051436.2384842-3-adrian.ratiu@collabora.com>
    <20201106101419.GB3811063@ubuntu-m3-large-x86>
    <87wnyyvh56.fsf@collabora.com>
    <871rh2i9xg.fsf@iwork.i-did-not-set--mail-host-address--so-tickle-me>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote:
> On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu wrote:
> >
> > On Fri, 06 Nov 2020, Nick Desaulniers
> > wrote:
> > > +#pragma clang loop vectorize(enable)
> > >         do {
> > >                 p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; p1[1] ^=
> > >                 p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
> > > ``` seems to generate the vectorized code.
> > >
> > > Why don't we find a way to make those pragma's more toolchain
> > > portable, rather than open coding them like I have above rather
> > > than this series?
> >
> > Hi again Nick,
> >
> > How did you verify the above pragmas generate correct vectorized
> > code? Have you tested this specific use case?
>
> I read the disassembly before and after my suggested use of pragmas;
> look for vld/vstr. You can also add -Rpass-missed=loop-vectorize to
> CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild
> arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled.
>

https://godbolt.org/z/1oo9M6

With the __restrict__ keywords added, clang seems to vectorize the loop,
but still reports that vectorization wasn't beneficial -- any idea
what's going on?
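
For anyone who wants to reproduce this outside the kernel tree, here is a minimal userspace sketch of the kind of loop under discussion. It is not the code from arch/arm/lib/xor-neon.c or from the godbolt link: the function name xor_5_sketch, its signature, and the word-count parameter are assumptions made for illustration. It only shows where the __restrict__ qualifiers and the loop pragma would sit.

```c
/*
 * Userspace sketch only. xor_5_sketch and its parameters are hypothetical
 * and are not taken from the kernel source.
 */
#include <assert.h>
#include <stddef.h>

void xor_5_sketch(size_t words,
                  unsigned long *__restrict__ p1,
                  const unsigned long *__restrict__ p2,
                  const unsigned long *__restrict__ p3,
                  const unsigned long *__restrict__ p4,
                  const unsigned long *__restrict__ p5)
{
    /* The do/while body handles two words per pass and runs at least once. */
    assert(words >= 2 && words % 2 == 0);
    size_t pairs = words / 2;

    /* Clang-specific hint; other compilers are expected to ignore it. */
#pragma clang loop vectorize(enable)
    do {
        p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0];
        p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
        p1 += 2;
        p2 += 2;
        p3 += 2;
        p4 += 2;
        p5 += 2;
    } while (--pairs > 0);
}
```

Building this with clang at -O2 and adding -Rpass=loop-vectorize, -Rpass-missed=loop-vectorize and -Rpass-analysis=loop-vectorize should print the vectorizer's remarks for the loop. The ARM target and NEON flags are left out here and would need to match the kernel build for the remarks to be comparable.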