References: <20201106051436.2384842-1-adrian.ratiu@collabora.com> <20201106051436.2384842-3-adrian.ratiu@collabora.com> <20201106101419.GB3811063@ubuntu-m3-large-x86> <87wnyyvh56.fsf@collabora.com> <871rh2i9xg.fsf@iwork.i-did-not-set--mail-host-address--so-tickle-me> <20201110221511.GA1373528@rani.riverdale.lan>
In-Reply-To: <20201110221511.GA1373528@rani.riverdale.lan>
From: Nick Desaulniers
Date: Tue, 10 Nov 2020 14:36:27 -0800
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
To: Arvind Sankar
Cc: Adrian Ratiu, Nathan Chancellor, Arnd Bergmann, Linux ARM, clang-built-linux, Russell King, LKML, Collabora Kernel ML, Ard Biesheuvel
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 10, 2020 at 2:15 PM Arvind Sankar wrote:
>
> On Tue, Nov 10, 2020 at 01:41:17PM -0800, Nick Desaulniers wrote:
> > On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu wrote:
> > >
> > > On Fri, 06 Nov 2020, Nick Desaulniers wrote:
> > > > +#pragma clang loop vectorize(enable)
> > > >         do {
> > > >                 p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0];
> > > >                 p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
> > > > ``` seems to generate the vectorized code.
> > > >
> > > > Why don't we find a way to make those pragma's more toolchain
> > > > portable, rather than open coding them like I have above rather
> > > > than this series?
> > >
> > > Hi again Nick,
> > >
> > > How did you verify the above pragmas generate correct vectorized
> > > code? Have you tested this specific use case?
> >
> > I read the disassembly before and after my suggested use of pragmas;
> > look for vld/vstr. You can also add -Rpass-missed=loop-vectorize to
> > CFLAGS_xor-neon.o in arch/arm/lib/Makefile and rebuild
> > arch/arm/lib/xor-neon.o with CONFIG_BTRFS enabled.
> >
> > https://godbolt.org/z/1oo9M6
>
> With the __restrict__ keywords added, clang seems to vectorize the loop,
> but still reports that vectorization wasn't beneficial -- any idea
> what's going on?

I suspect that loop-vectorize is a higher-level pass that relies on
slp-vectorizer for the transform.

$ clang -O2 --target=arm-linux-gnueabi -S -o - foo.c -mfpu=neon -mllvm -print-after-all
...
*** IR Dump After SLP Vectorizer ***
(bunch of <4 x i32> types)

If you add -Rpass-missed=slp-vectorizer, observe that the existing
warnings from -Rpass-missed=loop-vectorize disappear. I suspect
loop-vectorize prints a "remark" when a pass it calls did not print
one itself but returned some form of error code.

-Rpass=slp-vectorizer shows that it vectorizes two sequences of the
loop, and warns that some third portion (not immediately obvious to
me) was not beneficial.

-- 
Thanks,
~Nick Desaulniers
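
For anyone reproducing this locally without the godbolt link, a minimal
stand-in for foo.c might look like the sketch below. It is not the kernel's
xor-neon code verbatim: the function name xor_5, the WORDS_PER_LINE unroll
factor of 4, and the assert() guard are assumptions made to keep the example
short. It only mirrors the shape of the loop quoted above, with the
__restrict__ qualifiers Arvind mentioned.

```c
/* foo.c: hypothetical reproducer, not the kernel source. */
#include <assert.h>
#include <stddef.h>

/* Assumed unroll factor, chosen for brevity. */
#define WORDS_PER_LINE 4

/*
 * XOR four source buffers into p1, WORDS_PER_LINE words per iteration.
 * __restrict__ promises the compiler that the buffers do not overlap,
 * which removes one obstacle to vectorizing the loads and stores.
 */
void xor_5(size_t bytes,
           unsigned long *__restrict__ p1,
           const unsigned long *__restrict__ p2,
           const unsigned long *__restrict__ p3,
           const unsigned long *__restrict__ p4,
           const unsigned long *__restrict__ p5)
{
	size_t lines = bytes / sizeof(unsigned long) / WORDS_PER_LINE;

	/* The do/while body runs at least once, so reject short buffers. */
	assert(lines > 0);

	do {
		p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0];
		p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
		p1[2] ^= p2[2] ^ p3[2] ^ p4[2] ^ p5[2];
		p1[3] ^= p2[3] ^ p3[3] ^ p4[3] ^ p5[3];
		p1 += WORDS_PER_LINE;
		p2 += WORDS_PER_LINE;
		p3 += WORDS_PER_LINE;
		p4 += WORDS_PER_LINE;
		p5 += WORDS_PER_LINE;
	} while (--lines > 0);
}
```

Building it with the command above plus -Rpass=slp-vectorizer and
-Rpass-missed=slp-vectorizer should show which statement groups the SLP
vectorizer accepts or rejects for this file; the remarks may differ between
clang versions and from the kernel build.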